Multi-armed bandits
- what is it? background: repeatedly choosing among several options (arms) with unknown payoffs
- applications
  - clinical trials
  - testing website designs
  - recommendation: Netflix artwork
  - picking a mode of travel
- trade-off: exploit vs explore
- strategy for fixed number of pulls
  - pure exploration: pick arms at random throughout, equivalent to A/B testing
  - greedy: explore a little, then just exploit the arm that looks best
  - ε-greedy: with probability ε explore, with probability 1-ε exploit
  - Thompson sampling: demo
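
The four strategies above can be compared in a small simulation. What follows is a minimal sketch, not a reference implementation. It assumes each arm pays a reward of 1 or 0 with a fixed hidden success rate (a Bernoulli bandit), and that Thompson sampling starts from a uniform Beta(1, 1) prior on each arm. The names `run`, `pull`, `best_arm`, `epsilon`, and `explore_pulls` are illustrative choices, not taken from any library.

```python
import random


def pull(true_rates, arm, rng):
    """Return reward 1 with the arm's hidden success rate, else 0."""
    return 1 if rng.random() < true_rates[arm] else 0


def best_arm(successes, pulls):
    """Index of the arm with the highest observed success rate so far."""
    rates = [s / n if n else 0.0 for s, n in zip(successes, pulls)]
    return max(range(len(rates)), key=rates.__getitem__)


def run(strategy, true_rates, n_pulls, seed=0, epsilon=0.1, explore_pulls=100):
    """Play n_pulls rounds; return (total reward, number of pulls per arm)."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    pulls = [0] * k
    for t in range(n_pulls):
        if strategy == "pure_exploration":
            # Always pick uniformly at random (the A/B-testing analogue).
            arm = rng.randrange(k)
        elif strategy == "greedy":
            # Assumed variant: cycle through the arms for the first
            # explore_pulls rounds, then pick the best observed arm.
            arm = t % k if t < explore_pulls else best_arm(successes, pulls)
        elif strategy == "epsilon_greedy":
            # Explore with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                arm = rng.randrange(k)
            else:
                arm = best_arm(successes, pulls)
        elif strategy == "thompson":
            # Draw one sample per arm from its Beta posterior
            # (assumed Beta(1, 1) prior) and play the largest sample.
            samples = [
                rng.betavariate(1 + successes[a], 1 + pulls[a] - successes[a])
                for a in range(k)
            ]
            arm = max(range(k), key=samples.__getitem__)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        reward = pull(true_rates, arm, rng)
        successes[arm] += reward
        pulls[arm] += 1
    return sum(successes), pulls


if __name__ == "__main__":
    # Hypothetical success rates, made up for illustration only.
    rates = [0.2, 0.5, 0.7]
    for name in ("pure_exploration", "greedy", "epsilon_greedy", "thompson"):
        total, per_arm = run(name, rates, n_pulls=1000)
        print(f"{name:>16}: reward={total}, pulls per arm={per_arm}")
```

Running the script prints the total reward and the pull counts per arm for each strategy, which makes the trade-off visible: pure exploration spreads pulls evenly regardless of results, while the other three concentrate pulls on the arm that appears best.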