no code implementations • NeurIPS 2021 • Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen
At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.
1 code implementation • 18 Jun 2019 • Peter Grünwald, Rianne de Heide, Wouter Koolen
We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes.
no code implementations • 28 Nov 2018 • Emilie Kaufmann, Wouter Koolen
This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model.
no code implementations • NeurIPS 2018 • Emilie Kaufmann, Wouter Koolen, Aurélien Garivier
We develop refined non-asymptotic lower bounds, which show that optimality mandates very different sampling behavior for a low vs. a high true minimum.
no code implementations • NeurIPS 2017 • Emilie Kaufmann, Wouter Koolen
Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems.
no code implementations • 15 Feb 2016 • Aurélien Garivier, Emilie Kaufmann, Wouter Koolen
We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search.