no code implementations • NeurIPS 2023 • Alexandre Marthe, Aurélien Garivier, Claire Vernade
What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics.
no code implementations • 26 Jun 2023 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
We recover the result of Barber & Duchi (2014) stating that histogram estimators are optimal against Lipschitz distributions for the L2 risk under regular differential privacy, and we extend it to other norms and notions of privacy.
no code implementations • 14 Feb 2023 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
The first one consists in privately estimating the empirical quantiles of the samples and using this result as an estimator of the quantiles of the distribution.
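The first step, releasing empirical quantiles under differential privacy, can be illustrated with the classical exponential mechanism for a single quantile (in the style of Smith, 2011). This is a generic sketch under standard assumptions (bounded data, rank-distance utility with sensitivity 1), not necessarily the estimator analyzed in the paper:

```python
import numpy as np

def private_quantile(samples, q, eps, lo, hi, rng=None):
    """Release an eps-DP estimate of the q-th quantile of samples in [lo, hi].

    Sketch of the exponential mechanism for quantiles: each inter-sample
    interval is scored by how close its rank is to q*n, then a point is
    drawn uniformly inside a randomly chosen interval.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.sort(np.asarray(samples, dtype=float)), lo, hi)
    n = len(x)
    edges = np.concatenate(([lo], x, [hi]))          # n+2 edges -> n+1 intervals
    lengths = np.maximum(np.diff(edges), 0.0)
    utility = -np.abs(np.arange(n + 1) - q * n)      # rank-distance utility, sensitivity 1
    scores = (eps / 2.0) * utility
    weights = lengths * np.exp(scores - scores.max())  # stabilized exponential weights
    i = rng.choice(n + 1, p=weights / weights.sum())
    return rng.uniform(edges[i], edges[i + 1])       # uniform draw inside the chosen interval
```

With a mild privacy budget the estimate concentrates around the empirical quantile, which is in turn close to the distribution quantile for large samples.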
no code implementations • 11 Oct 2022 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
This is a realistic model for budget allocation across subdivisions of marketing campaigns, when the objective is to maximize the number of conversions.
no code implementations • 5 Oct 2022 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
In certain scenarios, we show that maintaining privacy results in a noticeable reduction in performance only when the level of privacy protection is very high.
no code implementations • 30 Sep 2022 • Antoine Barrier, Aurélien Garivier, Gilles Stoltz
All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.
no code implementations • 15 Feb 2022 • Clément Sébastien Lalanne, Clément Gastaud, Nicolas Grislain, Aurélien Garivier, Rémi Gribonval
We consider the differentially private estimation of multiple quantiles (MQ) of a distribution from a dataset, a key building block in modern data analysis.
no code implementations • 13 Feb 2022 • Aymen Al Marjani, Tomáš Kocák, Aurélien Garivier
Our method is based on a complete characterization of the alternative bandit instances that the optimal sampling strategy needs to rule out, thus making our bound tighter than the one provided by Mason et al. (2020).
no code implementations • NeurIPS 2021 • Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier
For a general alphabet size $n$, we give a sequential algorithm that uses no more samples than its batch counterpart, and possibly fewer if the actual distance between $\mathcal{D}_1$ and $\mathcal{D}_2$ is larger than $\epsilon$.
no code implementations • NeurIPS 2021 • Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen
At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.
no code implementations • 5 Jul 2021 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
no code implementations • NeurIPS 2021 • Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere
We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
no code implementations • 27 May 2021 • Antoine Barrier, Aurélien Garivier, Tomáš Kocák
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance.
no code implementations • 10 Nov 2020 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies.
no code implementations • 2 Nov 2020 • Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier
Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them.
1 code implementation • 7 Jul 2020 • Louis Béthune, Yacouba Kaloga, Pierre Borgnat, Aurélien Garivier, Amaury Habrard
We propose a novel algorithm for unsupervised graph representation learning with attributed graphs.
no code implementations • 20 May 2020 • Tomáš Kocák, Aurélien Garivier
We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint.
no code implementations • 23 Mar 2020 • Yoan Russac, Olivier Cappé, Aurélien Garivier
The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings.
no code implementations • 17 Apr 2019 • Léonard Torossian, Aurélien Garivier, Victor Picheny
We finally present numerical experiments that show the dramatic impact of tight bounds on the optimization of quantiles and CVaR.
no code implementations • 23 Jan 2019 • Léonard Torossian, Victor Picheny, Robert Faivre, Aurélien Garivier
We report on an empirical study of the main strategies for quantile regression in the context of stochastic computer experiments.
no code implementations • 9 Jul 2018 • Grégoire Jauvion, Nicolas Grislain, Pascal Sielenou Dkengne, Aurélien Garivier, Sébastien Gerchinovitz
The SSP acts as an intermediary between advertisers wanting to buy ad spaces and web publishers wanting to sell them; it must define a bidding strategy that delivers as many ads as possible to the advertisers while spending as little as possible.
1 code implementation • 14 May 2018 • Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz
We obtain this non-parametric bi-optimality result while streamlining the proofs of previously known regret bounds, and thus of the new analyses carried out; a second merit of the present contribution is therefore to provide a review of proofs of classical regret bounds for index-based strategies for $K$-armed stochastic bandits.
no code implementations • 8 May 2018 • Mastane Achab, Stephan Clémençon, Aurélien Garivier
We adapt and study three well-known strategies for this purpose, which were proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling.
no code implementations • 13 Nov 2017 • Aurélien Garivier, Pierre Ménard, Laurent Rossi
We analyze the sample complexity of the thresholding bandit problem, with and without the assumption that the mean values of the arms are increasing.
no code implementations • 27 Jul 2017 • Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade
This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values.
no code implementations • 23 Feb 2017 • Pierre Ménard, Aurélien Garivier
We propose the kl-UCB++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions.
1 code implementation • 31 Jan 2017 • Emilie Kaufmann, Aurélien Garivier
Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization.
no code implementations • NeurIPS 2016 • Aurélien Garivier, Emilie Kaufmann, Tor Lattimore
We study the problem of minimising regret in two-armed bandit problems with Gaussian rewards.
no code implementations • 23 Feb 2016 • Aurélien Garivier, Pierre Ménard, Gilles Stoltz
We revisit lower bounds on the regret in the case of multi-armed bandit problems.
no code implementations • 15 Feb 2016 • Aurélien Garivier, Emilie Kaufmann
We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems.
no code implementations • 15 Feb 2016 • Aurélien Garivier, Emilie Kaufmann, Wouter Koolen
We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search.
no code implementations • 16 Jul 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning.
no code implementations • 13 May 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes.
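A minimal sequential A/B test in this spirit can be sketched with anytime-valid Hoeffding confidence intervals for rewards in [0, 1]; this generic elimination rule is an illustration of the problem setting, not the (optimal) strategies analyzed in the paper:

```python
import math

def ab_test(sample_a, sample_b, delta=0.05, max_samples=100_000):
    """Sequential A/B test sketch: sample both options alternately and stop
    when their Hoeffding confidence intervals separate.

    sample_a/sample_b: callables returning one reward in [0, 1].
    Returns 0 if option A is declared best, 1 for option B.
    """
    sums = [0.0, 0.0]
    samplers = (sample_a, sample_b)
    for t in range(1, max_samples + 1):
        for i in (0, 1):
            sums[i] += samplers[i]()
        # Hoeffding radius with a crude union bound over time steps
        radius = math.sqrt(math.log(4 * t * t / delta) / (2 * t))
        means = [s / t for s in sums]
        if abs(means[0] - means[1]) > 2 * radius:
            return 0 if means[0] > means[1] else 1
    return 0 if sums[0] >= sums[1] else 1          # budget exhausted: pick the leader
```

The sample complexity of such a rule scales with the inverse squared gap between the two means; the paper's contribution is precisely to characterize and attain the optimal constants.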
no code implementations • 12 Feb 2011 • Aurélien Garivier, Olivier Cappé
This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems.
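For Bernoulli rewards, the KL-UCB index of an arm is the largest mean compatible with its empirical estimate at the current exploration level, and can be computed by bisection since the KL divergence is increasing in its second argument above the mean. A minimal sketch, using a plain log t exploration term rather than any refined tuning:

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    p = min(max(p, 1e-12), 1 - 1e-12)
    q = min(max(q, 1e-12), 1 - 1e-12)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(p_hat, n_pulls, t, precision=1e-6):
    """Largest q in [p_hat, 1] with n_pulls * kl(p_hat, q) <= log(t), by bisection."""
    budget = math.log(t) / n_pulls
    lo, hi = p_hat, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(p_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round the policy pulls the arm with the largest index; the index shrinks toward the empirical mean as an arm is pulled more often, which is what drives the logarithmic regret guarantees.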
no code implementations • NeurIPS 2010 • Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári
We consider structured multi-armed bandit tasks in which the agent is guided by prior structural knowledge that can be exploited to efficiently select the optimal arm(s) in situations where the number of arms is large, or even infinite.