no code implementations • 15 Jun 2021 • Long Yang, Zhao Li, Zehong Hu, Shasha Ruan, Shijian Li, Gang Pan, Hongyang Chen
In this paper, we propose a Thompson Sampling algorithm for \emph{unimodal} bandits, where the expected reward is unimodal over the partially ordered arms.