APPL: Approximate POMDP planning toolkit

speaker: Lim Zhan Wei
event: Singapore self-driving car meetup

POMDP model

  • states s
  • actions a
  • observations o
  • state transition function p(s'|s, a)
  • observation function p(o|s', a)
  • reward function R(s, a)
  • belief b(s) is a probability distribution over the states; it is updated with a Bayes filter (see the sketch after this list)
  • policy is a mapping from a belief to an action
    • an optimal policy maximizes the expected total reward
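A minimal sketch of the discrete Bayes-filter belief update behind these definitions, in Python; the tiny transition and observation matrices are made-up illustrative numbers, not anything from the talk.

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """New belief b'(s') ∝ p(o|s', a) * sum_s p(s'|s, a) * b(s).

    T[a][s, s'] = p(s'|s, a), Z[a][s', o] = p(o|s', a); both are hypothetical
    arrays for illustration only.
    """
    predicted = T[a].T @ b            # predict: sum over s of p(s'|s, a) b(s)
    updated = Z[a][:, o] * predicted  # weight by the observation likelihood
    return updated / updated.sum()    # normalize back to a distribution

# toy example: 2 states, 2 actions, 2 observations
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # action 0
              [[0.6, 0.4], [0.4, 0.6]]])  # action 1
b = np.array([0.5, 0.5])                  # uniform initial belief
print(belief_update(b, a=0, o=1, T=T, Z=Z))
```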

Main issues

  • curse of dimensionality
    • large state space
    • large belief space
  • curse of history
    • long planning horizon

Reduce the problem by sampling the states and observations

DESPOT samples K scenarios

  • choose the action that performs best across the K scenarios (a rough sketch follows below)
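A rough sketch of the scenario idea: fix K sampled start states and random-number streams, then score every candidate action against the same K scenarios. This only captures the one-step spirit of scenario sampling under a hypothetical `simulate` rollout function; it is not the actual DESPOT sparse belief-tree search.

```python
import random

# Toy sketch: score each action on the same K sampled scenarios, pick the best.
# `simulate(state, action, rng, depth, gamma)` is a hypothetical rollout that
# returns a discounted return; it stands in for the real simulator/search.

def sample_scenarios(sample_start_state, k, seed=0):
    """A scenario = a start state drawn from the belief plus a fixed random seed."""
    rng = random.Random(seed)
    return [(sample_start_state(rng), rng.randrange(2**31)) for _ in range(k)]

def evaluate_action(action, scenarios, simulate, depth=20, gamma=0.95):
    """Average discounted return of `action` over all K scenarios."""
    total = 0.0
    for start_state, scenario_seed in scenarios:
        rng = random.Random(scenario_seed)  # same randomness reused for every action
        total += simulate(start_state, action, rng, depth, gamma)
    return total / len(scenarios)

def best_action(actions, scenarios, simulate):
    """Choose the action with the highest average return across the scenarios."""
    return max(actions, key=lambda a: evaluate_action(a, scenarios, simulate))
```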

APPL toolkit

  • DESPOT is an online solver for discrete or continuous POMDPs
  • the toolkit also includes two other methods for offline solving
