Melvin's digital garden

Multi-armed bandits

  • what is it? background.
  • applications
    • clinical trial
    • testing website design
    • recommendation: netflix artwork
    • picking a mode of travel
  • trade-off: exploit vs explore
  • strategies for a fixed number of pulls
    • pure exploration, equivalent to A/B testing
    • greedy: explore a little, then just exploit the best one
    • ε-greedy: with probability ε explore (pull a random arm), with probability 1-ε exploit (pull the best arm so far)
    • thompson sampling: demo
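
A rough sketch of the last two strategies, to make the bullets concrete. Assumptions, none of which come from the note itself: rewards are Bernoulli (each pull pays 0 or 1), Thompson sampling starts from a uniform Beta(1, 1) prior on every arm, and all names (`pull`, `epsilon_greedy`, `thompson_sampling`, `true_probs`) are made up for illustration. This is not the SMPyBandits API.

```python
import random


def pull(true_probs, arm, rng):
    """Simulate one Bernoulli pull: reward 1 with probability true_probs[arm]."""
    return 1 if rng.random() < true_probs[arm] else 0


def epsilon_greedy(true_probs, n_pulls, epsilon, rng):
    """With probability epsilon pull a random arm, otherwise the best arm so far."""
    n_arms = len(true_probs)
    counts = [0] * n_arms   # pulls per arm
    rewards = [0] * n_arms  # total reward per arm
    total = 0
    for _ in range(n_pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            # exploit: highest empirical mean (unpulled arms count as 0.0)
            arm = max(
                range(n_arms),
                key=lambda a: rewards[a] / counts[a] if counts[a] else 0.0,
            )
        r = pull(true_probs, arm, rng)
        counts[arm] += 1
        rewards[arm] += r
        total += r
    return total, counts


def thompson_sampling(true_probs, n_pulls, rng):
    """Sample a plausible win rate per arm from its Beta posterior, pull the max."""
    n_arms = len(true_probs)
    wins = [0] * n_arms
    losses = [0] * n_arms
    total = 0
    for _ in range(n_pulls):
        # Beta(wins + 1, losses + 1) is the posterior under a Beta(1, 1) prior
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        r = pull(true_probs, arm, rng)
        if r:
            wins[arm] += 1
        else:
            losses[arm] += 1
        total += r
    return total, [wins[a] + losses[a] for a in range(n_arms)]


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.8]  # hypothetical arms, unknown to the strategies
    print("e-greedy:", epsilon_greedy(probs, 1000, 0.1, random.Random(0)))
    print("thompson:", thompson_sampling(probs, 1000, random.Random(0)))
```

Setting `epsilon=1.0` gives pure exploration (every pull is random), which is the A/B-testing end of the trade-off. Thompson sampling has no ε to tune: arms with few pulls have wide posteriors, so they still get sampled now and then.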

https://smpybandits.github.io/
