Search Results for author: Aurélien Garivier

Found 35 papers, 3 papers with code

Beyond Average Return in Markov Decision Processes

no code implementations NeurIPS 2023 Alexandre Marthe, Aurélien Garivier, Claire Vernade

What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics.

Distributional Reinforcement Learning

About the Cost of Central Privacy in Density Estimation

no code implementations 26 Jun 2023 Clément Lalanne, Aurélien Garivier, Rémi Gribonval

We recover the result of Barber & Duchi (2014) stating that histogram estimators are optimal against Lipschitz distributions for the L2 risk, and under regular differential privacy, and we extend it to other norms and notions of privacy.

Density Estimation

Private Statistical Estimation of Many Quantiles

no code implementations 14 Feb 2023 Clément Lalanne, Aurélien Garivier, Rémi Gribonval

The first one consists in privately estimating the empirical quantiles of the samples and using this result as an estimator of the quantiles of the distribution.

Density Estimation

Regret Analysis of the Stochastic Direct Search Method for Blind Resource Allocation

no code implementations 11 Oct 2022 Juliette Achddou, Olivier Cappé, Aurélien Garivier

This is a realistic model for budget allocation across subdivisions of marketing campaigns, when the objective is to maximize the number of conversions.

Marketing

On the Statistical Complexity of Estimation and Testing under Privacy Constraints

no code implementations 5 Oct 2022 Clément Lalanne, Aurélien Garivier, Rémi Gribonval

In certain scenarios, we show that maintaining privacy results in a noticeable reduction in performance only when the level of privacy protection is very high.

On Best-Arm Identification with a Fixed Budget in Non-Parametric Multi-Armed Bandits

no code implementations 30 Sep 2022 Antoine Barrier, Aurélien Garivier, Gilles Stoltz

All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.

Multi-Armed Bandits

Private Quantiles Estimation in the Presence of Atoms

no code implementations 15 Feb 2022 Clément Sébastien Lalanne, Clément Gastaud, Nicolas Grislain, Aurélien Garivier, Rémi Gribonval

We consider the differentially private estimation of multiple quantiles (MQ) of a distribution from a dataset, a key building block in modern data analysis.

On the complexity of All $\varepsilon$-Best Arms Identification

no code implementations 13 Feb 2022 Aymen Al Marjani, Tomáš Kocák, Aurélien Garivier

Our method is based on a complete characterization of the alternative bandit instances that the optimal sampling strategy needs to rule out, thus making our bound tighter than the one provided by Mason et al. (2020).

Sequential Algorithms for Testing Closeness of Distributions

no code implementations NeurIPS 2021 Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier

For a general alphabet size $n$, we give a sequential algorithm that uses no more samples than its batch counterpart, and possibly fewer if the actual distance between $\mathcal{D}_1$ and $\mathcal{D}_2$ is larger than $\epsilon$.

A/B/n Testing with Control in the Presence of Subpopulations

no code implementations NeurIPS 2021 Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen

At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.

Fast Rate Learning in Stochastic First Price Bidding

no code implementations 5 Jul 2021 Juliette Achddou, Olivier Cappé, Aurélien Garivier

First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.

Navigating to the Best Policy in Markov Decision Processes

no code implementations NeurIPS 2021 Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere

We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.

A Non-asymptotic Approach to Best-Arm Identification for Gaussian Bandits

no code implementations 27 May 2021 Antoine Barrier, Aurélien Garivier, Tomáš Kocák

We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance.

Efficient Algorithms for Stochastic Repeated Second-price Auctions

no code implementations 10 Nov 2020 Juliette Achddou, Olivier Cappé, Aurélien Garivier

We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies.

Marketing

Self-Concordant Analysis of Generalized Linear Bandits with Forgetting

no code implementations 2 Nov 2020 Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier

Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them.

Best Arm Identification in Spectral Bandits

no code implementations 20 May 2020 Tomáš Kocák, Aurélien Garivier

We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint.

Algorithms for Non-Stationary Generalized Linear Bandits

no code implementations 23 Mar 2020 Yoan Russac, Olivier Cappé, Aurélien Garivier

The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings.

X-Armed Bandits: Optimizing Quantiles, CVaR and Other Risks

no code implementations 17 Apr 2019 Léonard Torossian, Aurélien Garivier, Victor Picheny

We finally present numerical experiments that show a dramatic impact of tight bounds for the optimization of quantiles and CVaR.

Decision Making

A Review on Quantile Regression for Stochastic Computer Experiments

no code implementations 23 Jan 2019 Léonard Torossian, Victor Picheny, Robert Faivre, Aurélien Garivier

We report on an empirical study of the main strategies for quantile regression in the context of stochastic computer experiments.

regression

Optimization of a SSP's Header Bidding Strategy using Thompson Sampling

no code implementations 9 Jul 2018 Grégoire Jauvion, Nicolas Grislain, Pascal Sielenou Dkengne, Aurélien Garivier, Sébastien Gerchinovitz

The SSP acts as an intermediary between an advertiser wanting to buy ad spaces and a web publisher wanting to sell its ad spaces, and needs to define a bidding strategy to be able to deliver to the advertisers as many ads as possible while spending as little as possible.

Thompson Sampling

KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints

1 code implementation 14 May 2018 Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz

We were able to obtain this non-parametric bi-optimality result while working hard to streamline the proofs (of previously known regret bounds and thus of the new analyses carried out); a second merit of the present contribution is therefore to provide a review of proofs of classical regret bounds for index-based strategies for $K$-armed stochastic bandits.

Profitable Bandits

no code implementations 8 May 2018 Mastane Achab, Stephan Clémençon, Aurélien Garivier

We adapt and study three well-known strategies for this purpose that have proved most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling.

Management, Thompson Sampling

Thresholding Bandit for Dose-ranging: The Impact of Monotonicity

no code implementations 13 Nov 2017 Aurélien Garivier, Pierre Ménard, Laurent Rossi

We analyze the sample complexity of the thresholding bandit problem, with and without the assumption that the mean values of the arms are increasing.

valid

Max K-armed bandit: On the ExtremeHunter algorithm and beyond

no code implementations 27 Jul 2017 Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade

This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values.

A minimax and asymptotically optimal algorithm for stochastic bandits

no code implementations 23 Feb 2017 Pierre Ménard, Aurélien Garivier

We propose the kl-UCB++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions.

Learning the distribution with largest mean: two bandit frameworks

1 code implementation 31 Jan 2017 Emilie Kaufmann, Aurélien Garivier

Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization.

BIG-bench Machine Learning, Vocal Bursts Valence Prediction

On Explore-Then-Commit Strategies

no code implementations NeurIPS 2016 Aurélien Garivier, Emilie Kaufmann, Tor Lattimore

We study the problem of minimising regret in two-armed bandit problems with Gaussian rewards.

Explore First, Exploit Next: The True Shape of Regret in Bandit Problems

no code implementations 23 Feb 2016 Aurélien Garivier, Pierre Ménard, Gilles Stoltz

We revisit lower bounds on the regret in the case of multi-armed bandit problems.

Optimal Best Arm Identification with Fixed Confidence

no code implementations 15 Feb 2016 Aurélien Garivier, Emilie Kaufmann

We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems.

Maximin Action Identification: A New Bandit Framework for Games

no code implementations 15 Feb 2016 Aurélien Garivier, Emilie Kaufmann, Wouter Koolen

We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search.

On the Complexity of Best Arm Identification in Multi-Armed Bandit Models

no code implementations 16 Jul 2014 Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning.

LEMMA

On the Complexity of A/B Testing

no code implementations 13 May 2014 Emilie Kaufmann, Olivier Cappé, Aurélien Garivier

A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes.

The KL-UCB Algorithm for Bounded Stochastic Bandits and Beyond

no code implementations 12 Feb 2011 Aurélien Garivier, Olivier Cappé

This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems.

Parametric Bandits: The Generalized Linear Case

no code implementations NeurIPS 2010 Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári

We consider structured multi-armed bandit tasks in which the agent is guided by prior structural knowledge that can be exploited to efficiently select the optimal arm(s) in situations where the number of arms is large, or even infinite.
