A curated list of resources about multi-armed bandit (MAB).
The multi-armed bandit problem is a classic reinforcement learning problem that exemplifies the exploration-exploitation tradeoff.
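To make the tradeoff concrete, here is a minimal epsilon-greedy sketch on a Bernoulli bandit: with probability `epsilon` the agent explores a random arm, otherwise it exploits the arm with the highest estimated mean reward. The arm probabilities and parameter values below are illustrative assumptions, not taken from any specific reference.

```python
import random

def epsilon_greedy(true_probs, epsilon=0.1, n_rounds=1000, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy arm selection."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    total_reward = 0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit: best estimate
        reward = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return total_reward, counts

total_reward, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

Larger `epsilon` means more exploration (better arm estimates, lower short-term reward); smaller `epsilon` means more exploitation (risk of locking onto a suboptimal arm).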
If you want to run the Jupyter notebooks in this repository, follow the instructions in this section; otherwise, skip it. You can still view the rendered notebooks by following the links in the corresponding README files.
```shell
$ conda create --name mab_env --file mab_env.txt
$ conda activate mab_env
$ jupyter-lab
```
Then open the URL in your browser and enjoy the JupyterLab GUI!
- Multi-armed Bandit Allocation Indices
- Bandit Algorithms for Website Optimization
- Introduction to Multi-Armed Bandits
- The 1979 paper by Gittins: Bandit Processes and Dynamic Allocation Indices
- Scaling Multi-Armed Bandit Algorithms
- Multi-Armed Bandits with Correlated Arms
- Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Get your hands dirty by doing some projects (simulations)!
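As a starting point for a simulation project, here is a sketch of UCB1, a classic index policy: pull each arm once, then always pull the arm maximizing its empirical mean plus a confidence bonus that shrinks as the arm is sampled more. The arm probabilities and horizon are illustrative assumptions.

```python
import math
import random

def ucb1(true_probs, n_rounds=2000, seed=1):
    """UCB1 on a Bernoulli bandit; returns rewards, pull counts, realized regret."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    rewards = 0
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: pull each arm once
        else:
            # index = empirical mean + sqrt(2 ln t / n_a) confidence bonus
            arm = max(range(n_arms),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        r = 1 if rng.random() < true_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (r - values[arm]) / counts[arm]  # incremental mean update
        rewards += r
    regret = max(true_probs) * n_rounds - rewards  # vs. always pulling the best arm
    return rewards, counts, regret

rewards, counts, regret = ucb1([0.1, 0.5, 0.9])
```

A natural project is to run this alongside epsilon-greedy and Thompson sampling on the same arms and plot cumulative regret over time.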