31 May 2019 • Yi-Qi Hu, Yang Yu, Jun-Da Liao
We show theoretically that ER-UCB has a regret upper bound of $O(K \ln n)$ under independent feedback, which makes it as efficient as the classical UCB bandit.
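For readers unfamiliar with the baseline being compared against, the following is a minimal sketch of the classical UCB1 rule on Bernoulli arms. It is not ER-UCB and is not taken from the paper. The function name `ucb1`, the Bernoulli reward model, and the reading of $K$ as the number of arms and $n$ as the number of rounds are assumptions made for illustration only.

```python
import math
import random


def ucb1(arm_means, n_rounds, seed=0):
    """Classical UCB1 on simulated Bernoulli arms (illustrative only, not ER-UCB).

    arm_means: hypothetical true success probabilities, one per arm (K = len(arm_means)).
    n_rounds:  number of pulls n.
    Returns (pull counts per arm, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls of each arm
    sums = [0.0] * k      # total observed reward of each arm
    best = max(arm_means)
    regret = 0.0

    for t in range(1, n_rounds + 1):
        if t <= k:
            # Initialisation: pull every arm once so each index is defined.
            arm = t - 1
        else:
            # Choose the arm with the largest empirical mean plus confidence bonus.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best mean and the chosen arm's mean.
        regret += best - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.2, 0.5, 0.8], n_rounds=2000)
    print("pulls per arm:", counts)
    print("cumulative pseudo-regret:", round(regret, 2))
```

In this sketch the confidence bonus shrinks as an arm is pulled more often, so pulls of suboptimal arms grow only logarithmically in the number of rounds. That behaviour is what a bound of the form $O(K \ln n)$ describes for the classical algorithm.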
AutoML