no code implementations • NeurIPS 2019 • Chao Tao, Saúl Blanco, Jian Peng, Yuan Zhou
We consider the thresholding bandit problem, whose goal is to identify the arms whose mean rewards exceed a given threshold $\theta$, using a fixed budget of $T$ trials.
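To make the problem setup concrete, here is a minimal sketch of a naive uniform-sampling baseline. It is not the algorithm proposed in the paper; it only illustrates the setting. The function name `uniform_thresholding`, the Bernoulli reward model, and the round-robin allocation are all assumptions made for illustration.

```python
import random


def uniform_thresholding(means, theta, budget, seed=0):
    """Naive baseline (illustrative only, not the paper's method).

    Spends the budget of T trials round-robin over the arms, then returns
    the set of arm indices whose empirical mean is at least theta.
    Rewards are assumed to be Bernoulli with the given (hidden) means.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k      # number of times each arm was pulled
    totals = [0.0] * k   # sum of rewards observed for each arm

    for t in range(budget):
        arm = t % k  # round-robin allocation of the T trials
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    # Report the arms whose empirical mean reaches the threshold.
    return {
        i for i in range(k)
        if pulls[i] > 0 and totals[i] / pulls[i] >= theta
    }


if __name__ == "__main__":
    # Hypothetical instance: four arms, threshold 0.5, budget T = 4000.
    print(uniform_thresholding([0.1, 0.3, 0.7, 0.9], theta=0.5, budget=4000))
```

Uniform allocation ignores how close each arm's mean is to $\theta$, so it spends as many trials on easy arms as on hard ones; adaptive allocation strategies aim to do better under the same budget.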