no code implementations • NeurIPS 2021 • Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen
At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.
no code implementations • 5 Jul 2021 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
1 code implementation • 21 Jun 2021 • Dorian Baudry, Yoan Russac, Olivier Cappé
There has been a recent surge of interest in nonparametric bandit algorithms based on subsampling.
no code implementations • 10 Nov 2020 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies.
no code implementations • 2 Nov 2020 • Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier
Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them.
1 code implementation • 23 Jun 2020 • Louis Filstroff, Olivier Gouvert, Cédric Févotte, Olivier Cappé
Non-negative matrix factorization (NMF) has become a well-established class of methods for the analysis of non-negative data.
no code implementations • 23 Mar 2020 • Yoan Russac, Olivier Cappé, Aurélien Garivier
The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings.
1 code implementation • NeurIPS 2019 • Yoan Russac, Claire Vernade, Olivier Cappé
To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past.
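The forgetting mechanism mentioned above can be illustrated in one dimension. The following is a minimal sketch of exponentially discounted ridge regression for a scalar feature, not the D-LinUCB algorithm itself: there is no confidence set or action selection, and the function name, `gamma` and `lam` are hypothetical names chosen for this illustration.

```python
def discounted_ridge_1d(features, rewards, gamma=0.9, lam=1.0):
    """Exponentially discounted ridge estimate of a scalar slope.

    Minimises sum_s gamma**(t - s) * (r_s - a_s * theta)**2 + lam * theta**2,
    where t is the index of the most recent observation.
    """
    v = 0.0  # discounted sum of a_s ** 2
    b = 0.0  # discounted sum of a_s * r_s
    for a, r in zip(features, rewards):
        # Older observations are multiplied by gamma once more at every step.
        v = gamma * v + a * a
        b = gamma * b + a * r
    return b / (v + lam)


# The slope switches abruptly from +1 to -1 halfway through the data.
features = [1.0] * 200
rewards = [1.0] * 100 + [-1.0] * 100
tracking = discounted_ridge_1d(features, rewards, gamma=0.9)
averaging = discounted_ridge_1d(features, rewards, gamma=1.0)
```

With `gamma < 1` the estimate follows the most recent regime, while `gamma = 1.0` reduces to ordinary ridge regression and averages over both regimes.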
no code implementations • 28 Jun 2017 • Claire Vernade, Olivier Cappé, Vianney Perchet
We assume that the probability of conversion associated with each action is unknown while the distribution of the conversion delay is known, distinguishing between the (idealized) case where the conversion events may be observed whatever their delay and the more realistic setting in which late conversions are censored.
no code implementations • NeurIPS 2016 • Paul Lagrée, Claire Vernade, Olivier Cappé
Sequentially learning to place items in multi-position displays or lists is a task that can be cast into the multiple-play semi-bandit setting.
no code implementations • 4 Mar 2016 • Hossein Vahabi, Paul Lagrée, Claire Vernade, Olivier Cappé
In many web applications, a recommendation is not a single item suggested to a user but a list of possibly interesting contents that may be ranked in some contexts.
no code implementations • 30 Sep 2015 • Claire Vernade, Olivier Cappé
Recommending items to users is a challenging task due to the large amount of missing information.
no code implementations • 16 Jul 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning.
no code implementations • 13 May 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes.
no code implementations • 12 Feb 2011 • Aurélien Garivier, Olivier Cappé
This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems.
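As an illustration of what an index policy computes, here is a minimal sketch of a KL-UCB-style index for Bernoulli rewards. It assumes the simplified exploration budget log(t), and the helper names and bisection tolerance are hypothetical choices for this example rather than details taken from the listing.

```python
import math


def bernoulli_kl(p, q):
    """Kullback-Leibler divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12  # clamp away from 0 and 1 to keep the logarithms finite
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, tol=1e-9):
    """Largest q in [mean, 1] such that pulls * kl(mean, q) <= log(t).

    Assumes pulls >= 1 and t >= 1; found by bisection, since kl(mean, q)
    is increasing in q on [mean, 1].
    """
    budget = math.log(t)
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pulls * bernoulli_kl(mean, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


# An arm with the same empirical mean gets a tighter index once pulled more.
few_pulls = kl_ucb_index(0.5, pulls=10, t=100)
many_pulls = kl_ucb_index(0.5, pulls=100, t=100)
```

At each round the policy would play the arm with the largest index; the index shrinks towards the empirical mean as an arm accumulates pulls.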
no code implementations • 27 Dec 2007 • Olivier Cappé, Eric Moulines
The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator.