Search Results for author: Etienne Boursier

Found 16 papers, 6 papers with code

Incentivized Learning in Principal-Agent Bandit Games

no code implementations 6 Mar 2024 Antoine Scheid, Daniil Tiapkin, Etienne Boursier, Aymeric Capitaine, El Mahdi El Mhamdi, Eric Moulines, Michael I. Jordan, Alain Durmus

This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.

Early alignment in two-layer networks training is a two-edged sword

1 code implementation 19 Jan 2024 Etienne Boursier, Nicolas Flammarion

Training neural networks with first order optimisation methods is at the core of the empirical success of deep learning.

Approximate information maximization for bandit games

no code implementations 19 Oct 2023 Alex Barbier-Chebbah, Christian L. Vestergaard, Jean-Baptiste Masson, Etienne Boursier

Built on this principle, we propose a new class of bandit algorithms that maximize an approximation to the information of a key variable within the system.

Decision Making

Constant or logarithmic regret in asynchronous multiplayer bandits

no code implementations 31 May 2023 Hugo Richard, Etienne Boursier, Vianney Perchet

This motivates the harder, asynchronous multiplayer bandits problem, which was first tackled with an explore-then-commit (ETC) algorithm (see Dakdouk, 2022), with a regret upper-bound in $\mathcal{O}(T^{\frac{2}{3}})$.

First-order ANIL learns linear representations despite misspecified latent dimension

no code implementations 2 Mar 2023 Oğuz Kaan Yuksel, Etienne Boursier, Nicolas Flammarion

In particular, model-agnostic methods look for initialisation points from which gradient descent quickly adapts to any new task.

Meta-Learning

A survey on multi-player bandits

no code implementations 29 Nov 2022 Etienne Boursier, Vianney Perchet

Due mostly to its application to cognitive radio networks, multiplayer bandits gained a lot of interest in the last decade.

Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs

1 code implementation 2 Jun 2022 Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion

The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution.

Trace norm regularization for multi-task learning with scarce data

1 code implementation 14 Feb 2022 Etienne Boursier, Mikhail Konobeev, Nicolas Flammarion

Multi-task learning leverages structural similarities between multiple tasks to learn despite very few samples.

Meta-Learning Multi-Task Learning

Decentralized Learning in Online Queuing Systems

no code implementations NeurIPS 2021 Flore Sentenac, Etienne Boursier, Vianney Perchet

In the centralized case, the number of accumulated packets remains bounded (i.e., the system is \textit{stable}) as long as the ratio between service rates and arrival rates is larger than $1$.

Making the most of your day: online learning for optimal allocation of time

1 code implementation NeurIPS 2021 Etienne Boursier, Tristan Garrec, Vianney Perchet, Marco Scarsini

If she accepts the proposal, she is busy for the duration of the task and obtains a reward that depends on the task duration.

Scheduling

Social Learning in Non-Stationary Environments

no code implementations 20 Jul 2020 Etienne Boursier, Vianney Perchet, Marco Scarsini

In the simple uni-dimensional and static setting, beliefs about the quality are known to converge to its true value.

Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

no code implementations NeurIPS 2020 Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko

In CMAB, the question of the existence of an efficient policy with an optimal asymptotic regret (up to a factor poly-logarithmic with the action size) is still open for many families of distributions, including mutually independent outcomes, and more generally the multivariate sub-Gaussian family.

Thompson Sampling

Utility/Privacy Trade-off through the lens of Optimal Transport

1 code implementation 27 May 2019 Etienne Boursier, Vianney Perchet

Strategic information is valuable either by remaining private (for instance if it is sensitive) or, on the other hand, by being used publicly to increase some utility.

A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players

no code implementations 4 Feb 2019 Etienne Boursier, Emilie Kaufmann, Abbas Mehrabian, Vianney Perchet

We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players receive zero reward.

Open-Ended Question Answering

SIC-MMAB: Synchronisation Involves Communication in Multiplayer Multi-Armed Bandits

1 code implementation NeurIPS 2019 Etienne Boursier, Vianney Perchet

Motivated by cognitive radio networks, we consider the stochastic multiplayer multi-armed bandit problem, where several players pull arms simultaneously and collisions occur if one of them is pulled by several players at the same stage.

Multi-Armed Bandits
