APPL: Approximate POMDP planning toolkit

speaker: Lim Zhan Wei
event: Singapore self-driving car meetup

POMDP model

  • states s
  • actions a
  • observations o
  • state transition function p(s'|s, a)
  • observation function p(o|s', a)
  • reward function R(s, a)
  • belief b(s) is a probability distribution over the states; it is updated with a Bayes filter (see the sketch after this list)
  • policy is a mapping from a belief to an action
    • an optimal policy maximizes the expected total reward
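A minimal sketch of the discrete Bayes-filter belief update behind these definitions, in Python; the tiny transition and observation matrices are made-up illustrative numbers, not anything from the talk.

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """New belief b'(s') ∝ p(o|s', a) * sum_s p(s'|s, a) * b(s).

    T[a][s, s'] = p(s'|s, a), Z[a][s', o] = p(o|s', a); both are hypothetical
    arrays for illustration only.
    """
    predicted = T[a].T @ b            # predict: sum over s of p(s'|s, a) b(s)
    updated = Z[a][:, o] * predicted  # weight by the observation likelihood
    return updated / updated.sum()    # normalize back to a distribution

# toy example: 2 states, 2 actions, 2 observations
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # action 0
              [[0.6, 0.4], [0.4, 0.6]]])  # action 1
b = np.array([0.5, 0.5])                  # uniform initial belief
print(belief_update(b, a=0, o=1, T=T, Z=Z))
```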

Main issues

  • curse of dimensionality
    • large state space
    • large belief space
  • curse of history
    • long planning horizon

Reduce the problem by sampling the states and observations

DESPOT samples K scenarios

  • choose the action that performs best across the K scenarios (a rough sketch follows below)
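A rough sketch of the scenario idea: fix K sampled start states and random-number streams, then score every candidate action against the same K scenarios. This only captures the one-step spirit of scenario sampling under a hypothetical `simulate` rollout function; it is not the actual DESPOT sparse belief-tree search.

```python
import random

# Toy sketch: score each action on the same K sampled scenarios, pick the best.
# `simulate(state, action, rng, depth, gamma)` is a hypothetical rollout that
# returns a discounted return; it stands in for the real simulator/search.

def sample_scenarios(sample_start_state, k, seed=0):
    """A scenario = a start state drawn from the belief plus a fixed random seed."""
    rng = random.Random(seed)
    return [(sample_start_state(rng), rng.randrange(2**31)) for _ in range(k)]

def evaluate_action(action, scenarios, simulate, depth=20, gamma=0.95):
    """Average discounted return of `action` over all K scenarios."""
    total = 0.0
    for start_state, scenario_seed in scenarios:
        rng = random.Random(scenario_seed)  # same randomness reused for every action
        total += simulate(start_state, action, rng, depth, gamma)
    return total / len(scenarios)

def best_action(actions, scenarios, simulate):
    """Choose the action with the highest average return across the scenarios."""
    return max(actions, key=lambda a: evaluate_action(a, scenarios, simulate))
```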

APPL toolkit

  • DESPOT is an online solver for discrete or continuous POMDPs
  • the toolkit also includes two other methods for offline solving
