14 May 2021 • Ming Liang Ang, Eloise Y. Y. Lim, Joel Q. L. Chang
The multi-armed bandit (MAB) problem is a ubiquitous decision-making problem that exemplifies the exploration-exploitation tradeoff.
Decision Making • Thompson Sampling