no code implementations • 4 Jan 2021 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet
On the other hand, this heuristic performs reasonably well in practice and it even has sublinear, and even near-optimal, regret bounds in some very specific linear contextual and Bayesian bandit models.
no code implementations • 28 Dec 2020 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet
Continuously learning and leveraging the knowledge accumulated from prior tasks in order to improve future performance is a long standing machine learning problem.
1 code implementation • NeurIPS 2019 • Matthieu Jedor, Jonathan Louedec, Vianney Perchet
We introduce a new stochastic multi-armed bandit setting where arms are grouped inside ``ordered'' categories.