no code implementations • NeurIPS 2023 • Alexandre Marthe, Aurélien Garivier, Claire Vernade
What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics.
no code implementations • 26 Jun 2023 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
We recover the result of Barber & Duchi (2014) stating that histogram estimators are optimal against Lipschitz distributions for the L2 risk under regular differential privacy, and we extend it to other norms and notions of privacy.
no code implementations • 14 Feb 2023 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
The first one consists in privately estimating the empirical quantiles of the samples and using this result as an estimator of the quantiles of the distribution.
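The first step, releasing empirical quantiles under differential privacy, can be illustrated with the classical exponential mechanism for a single quantile (in the style of Smith, 2011). This is a generic sketch under standard assumptions (bounded data, rank-distance utility with sensitivity 1), not necessarily the estimator analyzed in the paper:

```python
import numpy as np

def private_quantile(samples, q, eps, lo, hi, rng=None):
    """Release an eps-DP estimate of the q-th quantile of samples in [lo, hi].

    Sketch of the exponential mechanism for quantiles: each inter-sample
    interval is scored by how close its rank is to q*n, then a point is
    drawn uniformly inside a randomly chosen interval.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.sort(np.asarray(samples, dtype=float)), lo, hi)
    n = len(x)
    edges = np.concatenate(([lo], x, [hi]))          # n+2 edges -> n+1 intervals
    lengths = np.maximum(np.diff(edges), 0.0)
    utility = -np.abs(np.arange(n + 1) - q * n)      # rank-distance utility, sensitivity 1
    scores = (eps / 2.0) * utility
    weights = lengths * np.exp(scores - scores.max())  # stabilized exponential weights
    i = rng.choice(n + 1, p=weights / weights.sum())
    return rng.uniform(edges[i], edges[i + 1])       # uniform draw inside the chosen interval
```

With a mild privacy budget the estimate concentrates around the empirical quantile, which is in turn close to the distribution quantile for large samples.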
no code implementations • 11 Oct 2022 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
This is a realistic model for budget allocation across subdivisions of marketing campaigns, when the objective is to maximize the number of conversions.
no code implementations • 5 Oct 2022 • Clément Lalanne, Aurélien Garivier, Rémi Gribonval
In certain scenarios, we show that maintaining privacy results in a noticeable reduction in performance only when the level of privacy protection is very high.
no code implementations • 30 Sep 2022 • Antoine Barrier, Aurélien Garivier, Gilles Stoltz
All these new upper and lower bounds generalize existing bounds based, e.g., on gaps between distributions.
no code implementations • 15 Feb 2022 • Clément Sébastien Lalanne, Clément Gastaud, Nicolas Grislain, Aurélien Garivier, Rémi Gribonval
We consider the differentially private estimation of multiple quantiles (MQ) of a distribution from a dataset, a key building block in modern data analysis.
no code implementations • 13 Feb 2022 • Aymen Al Marjani, Tomáš Kocák, Aurélien Garivier
Our method is based on a complete characterization of the alternative bandit instances that the optimal sampling strategy needs to rule out, thus making our bound tighter than the one provided by Mason et al. (2020).
no code implementations • NeurIPS 2021 • Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier
For a general alphabet size $n$, we give a sequential algorithm that uses no more samples than its batch counterpart, and possibly fewer if the actual distance between $\mathcal{D}_1$ and $\mathcal{D}_2$ is larger than $\epsilon$.
no code implementations • NeurIPS 2021 • Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen
At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.
no code implementations • 5 Jul 2021 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
First-price auctions have largely replaced traditional bidding approaches based on Vickrey auctions in programmatic advertising.
no code implementations • NeurIPS 2021 • Aymen Al Marjani, Aurélien Garivier, Alexandre Proutiere
We investigate the classical active pure exploration problem in Markov Decision Processes, where the agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
no code implementations • 27 May 2021 • Antoine Barrier, Aurélien Garivier, Tomáš Kocák
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance.
no code implementations • 10 Nov 2020 • Juliette Achddou, Olivier Cappé, Aurélien Garivier
We further provide the first parametric lower bound for this problem that applies to generic UCB-like strategies.
no code implementations • 2 Nov 2020 • Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier
Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them.
1 code implementation • 7 Jul 2020 • Louis Béthune, Yacouba Kaloga, Pierre Borgnat, Aurélien Garivier, Amaury Habrard
We propose a novel algorithm for unsupervised graph representation learning with attributed graphs.
no code implementations • 20 May 2020 • Tomáš Kocák, Aurélien Garivier
We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint.
no code implementations • 23 Mar 2020 • Yoan Russac, Olivier Cappé, Aurélien Garivier
The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings.
no code implementations • 17 Apr 2019 • Léonard Torossian, Aurélien Garivier, Victor Picheny
We finally present numerical experiments that show the dramatic impact of tight bounds on the optimization of quantiles and CVaR.
no code implementations • 23 Jan 2019 • Léonard Torossian, Victor Picheny, Robert Faivre, Aurélien Garivier
We report on an empirical study of the main strategies for quantile regression in the context of stochastic computer experiments.
no code implementations • 9 Jul 2018 • Grégoire Jauvion, Nicolas Grislain, Pascal Sielenou Dkengne, Aurélien Garivier, Sébastien Gerchinovitz
The SSP acts as an intermediary between advertisers wanting to buy ad spaces and web publishers wanting to sell them; it must define a bidding strategy that delivers as many ads as possible to the advertisers while spending as little as possible.
1 code implementation • 14 May 2018 • Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz
We obtain this non-parametric bi-optimality result while streamlining the proofs of previously known regret bounds, and thus of the new analyses carried out; a second merit of the present contribution is therefore to provide a review of proofs of classical regret bounds for index-based strategies for $K$-armed stochastic bandits.
no code implementations • 8 May 2018 • Mastane Achab, Stephan Clémençon, Aurélien Garivier
We adapt and study three well-known strategies for this purpose, which were proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling.
no code implementations • 13 Nov 2017 • Aurélien Garivier, Pierre Ménard, Laurent Rossi
We analyze the sample complexity of the thresholding bandit problem, with and without the assumption that the mean values of the arms are increasing.
no code implementations • 27 Jul 2017 • Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade
This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values.
no code implementations • 23 Feb 2017 • Pierre Ménard, Aurélien Garivier
We propose the kl-UCB++ algorithm for regret minimization in stochastic bandit models with exponential families of distributions.
1 code implementation • 31 Jan 2017 • Emilie Kaufmann, Aurélien Garivier
Over the past few years, the multi-armed bandit model has become increasingly popular in the machine learning community, partly because of applications including online content optimization.
no code implementations • NeurIPS 2016 • Aurélien Garivier, Emilie Kaufmann, Tor Lattimore
We study the problem of minimising regret in two-armed bandit problems with Gaussian rewards.
no code implementations • 23 Feb 2016 • Aurélien Garivier, Pierre Ménard, Gilles Stoltz
We revisit lower bounds on the regret in the case of multi-armed bandit problems.
no code implementations • 15 Feb 2016 • Aurélien Garivier, Emilie Kaufmann
We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems.
no code implementations • 15 Feb 2016 • Aurélien Garivier, Emilie Kaufmann, Wouter Koolen
We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search.
no code implementations • 16 Jul 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning.
no code implementations • 13 May 2014 • Emilie Kaufmann, Olivier Cappé, Aurélien Garivier
A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes.
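A minimal sequential A/B test in this spirit can be sketched with anytime-valid Hoeffding confidence intervals for rewards in [0, 1]; this generic elimination rule is an illustration of the problem setting, not the (optimal) strategies analyzed in the paper:

```python
import math

def ab_test(sample_a, sample_b, delta=0.05, max_samples=100_000):
    """Sequential A/B test sketch: sample both options alternately and stop
    when their Hoeffding confidence intervals separate.

    sample_a/sample_b: callables returning one reward in [0, 1].
    Returns 0 if option A is declared best, 1 for option B.
    """
    sums = [0.0, 0.0]
    samplers = (sample_a, sample_b)
    for t in range(1, max_samples + 1):
        for i in (0, 1):
            sums[i] += samplers[i]()
        # Hoeffding radius with a crude union bound over time steps
        radius = math.sqrt(math.log(4 * t * t / delta) / (2 * t))
        means = [s / t for s in sums]
        if abs(means[0] - means[1]) > 2 * radius:
            return 0 if means[0] > means[1] else 1
    return 0 if sums[0] >= sums[1] else 1          # budget exhausted: pick the leader
```

The sample complexity of such a rule scales with the inverse squared gap between the two means; the paper's contribution is precisely to characterize and attain the optimal constants.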
no code implementations • 12 Feb 2011 • Aurélien Garivier, Olivier Cappé
This paper presents a finite-time analysis of the KL-UCB algorithm, an online, horizon-free index policy for stochastic bandit problems.
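For Bernoulli rewards, the KL-UCB index of an arm is the largest mean compatible with its empirical estimate at the current exploration level, and can be computed by bisection since the KL divergence is increasing in its second argument above the mean. A minimal sketch, using a plain log t exploration term rather than any refined tuning:

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for stability."""
    p = min(max(p, 1e-12), 1 - 1e-12)
    q = min(max(q, 1e-12), 1 - 1e-12)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def klucb_index(p_hat, n_pulls, t, precision=1e-6):
    """Largest q in [p_hat, 1] with n_pulls * kl(p_hat, q) <= log(t), by bisection."""
    budget = math.log(t) / n_pulls
    lo, hi = p_hat, 1.0
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if kl_bernoulli(p_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo
```

At each round the policy pulls the arm with the largest index; the index shrinks toward the empirical mean as an arm is pulled more often, which is what drives the logarithmic regret guarantees.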
no code implementations • NeurIPS 2010 • Sarah Filippi, Olivier Cappé, Aurélien Garivier, Csaba Szepesvári
We consider structured multi-armed bandit tasks in which the agent is guided by prior structural knowledge that can be exploited to efficiently select the optimal arm(s) in situations where the number of arms is large, or even infinite.