Search Results for author: Marc Abeille

Found 11 papers, 1 paper with code

Near-continuous time Reinforcement Learning for continuous state-action spaces

no code implementations • 6 Sep 2023 • Lorenzo Croissant, Marc Abeille, Bruno Bouchard

In addition, we consider a generic reward function and model the state dynamics according to a jump process with an arbitrary transition kernel on $\mathbb{R}^d$.

Reinforcement Learning

Jointly Efficient and Optimal Algorithms for Logistic Bandits

2 code implementations • 6 Jan 2022 • Louis Faury, Marc Abeille, Kwang-Sung Jun, Clément Calauzènes

Logistic Bandits have recently undergone careful scrutiny by virtue of their combined theoretical and practical relevance.

Computational Efficiency

Regret Bounds for Generalized Linear Bandits under Parameter Drift

no code implementations • 9 Mar 2021 • Louis Faury, Yoan Russac, Marc Abeille, Clément Calauzènes

Generalized Linear Bandits (GLBs) are powerful extensions to the Linear Bandit (LB) setting, broadening the benefits of reward parametrization beyond linearity.

Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits

no code implementations • 23 Oct 2020 • Marc Abeille, Louis Faury, Clément Calauzènes

It was shown by Faury et al. (2020) that the learning-theoretic difficulties of Logistic Bandits can be embodied by a large (sometimes prohibitively) problem-dependent constant $\kappa$, characterizing the magnitude of the reward's non-linearity.

Real-Time Optimisation for Online Learning in Auctions

no code implementations • ICML 2020 • Lorenzo Croissant, Marc Abeille, Clément Calauzènes

In display advertising, a small group of sellers and bidders face each other in up to $10^{12}$ auctions a day.

Improved Optimistic Algorithms for Logistic Bandits

no code implementations • ICML 2020 • Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq

For logistic bandits, the frequentist regret guarantees of existing algorithms are $\tilde{\mathcal{O}}(\kappa \sqrt{T})$, where $\kappa$ is a problem-dependent constant.
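The constant $\kappa$ above is standard in the logistic bandit literature: it is the inverse of the smallest slope of the sigmoid link over the decision set, and it can be exponentially large in the norm of the reward parameter. A minimal, hypothetical illustration (the function names and the arm set are my own, not from the paper):

```python
# Hypothetical sketch: compute kappa = 1 / min_x mu'(<x, theta*>) for a logistic
# bandit, where mu is the sigmoid link. Not code from the paper.
import numpy as np

def sigmoid_slope(z):
    """Derivative of the sigmoid mu(z) = 1 / (1 + exp(-z))."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def kappa(arms, theta_star):
    """Inverse of the smallest link slope over the (finite) arm set."""
    slopes = sigmoid_slope(arms @ theta_star)
    return 1.0 / slopes.min()

# Two opposite unit arms: kappa grows quickly with the norm of theta*.
arms = np.array([[1.0, 0.0], [-1.0, 0.0]])
k_small = kappa(arms, np.array([1.0, 0.0]))  # modest non-linearity
k_large = kappa(arms, np.array([5.0, 0.0]))  # slope nearly flat at the extremes
```

This is why a $\tilde{\mathcal{O}}(\kappa \sqrt{T})$ guarantee can be nearly vacuous in practice, and why removing $\kappa$ from the leading term matters.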

Thompson Sampling in Non-Episodic Restless Bandits

no code implementations • 12 Oct 2019 • Young Hun Jung, Marc Abeille, Ambuj Tewari

Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging.

Open-Ended Question Answering
Thompson Sampling

Improved Regret Bounds for Thompson Sampling in Linear Quadratic Control Problems

no code implementations • ICML 2018 • Marc Abeille, Alessandro Lazaric

Thompson sampling (TS) is an effective approach to trade off exploration and exploitation in reinforcement learning.

Thompson Sampling

Thompson Sampling for Linear-Quadratic Control Problems

no code implementations • 27 Mar 2017 • Marc Abeille, Alessandro Lazaric

Despite its empirical and theoretical success in a wide range of problems, from multi-armed to linear bandits, we show that when studying the frequentist regret of TS in control problems, one must trade off the frequency of sampling optimistic parameters against the frequency of switches in the control policy.

Thompson Sampling

Linear Thompson Sampling Revisited

no code implementations • 20 Nov 2016 • Marc Abeille, Alessandro Lazaric

We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic linear bandit setting.

Thompson Sampling
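For readers unfamiliar with the setting analyzed above, a rough sketch of generic Thompson sampling on a stochastic linear bandit follows. Everything here (the sampling scale `v`, the arm set, the parameter names) is an illustrative assumption of mine, not the paper's algorithm or analysis:

```python
# Hypothetical minimal sketch of linear Thompson sampling; illustrative only.
import numpy as np

def linear_thompson_sampling(arms, theta_star, T=500, lam=1.0, noise=0.1, v=0.5, seed=0):
    """Run TS on a stochastic linear bandit with finite arm set `arms` (K x d).

    Returns the cumulative pseudo-regret against the best fixed arm.
    """
    rng = np.random.default_rng(seed)
    K, d = arms.shape
    V = lam * np.eye(d)              # regularized design matrix
    b = np.zeros(d)                  # running sum of x_t * r_t
    expected = arms @ theta_star     # true expected reward of each arm
    regret = 0.0
    for _ in range(T):
        theta_hat = np.linalg.solve(V, b)                       # ridge estimate
        cov = v**2 * np.linalg.inv(V)                           # sampling covariance
        theta_tilde = rng.multivariate_normal(theta_hat, cov)   # perturbed parameter
        a = int(np.argmax(arms @ theta_tilde))                  # greedy w.r.t. the sample
        x = arms[a]
        r = x @ theta_star + noise * rng.normal()               # noisy linear reward
        V += np.outer(x, x)                                     # rank-one update
        b += r * x
        regret += expected.max() - expected[a]
    return regret
```

The key mechanism the frequentist analysis hinges on is visible here: TS does not build explicit confidence sets but acts greedily on a randomly perturbed estimate, and the perturbation scale `v` controls how often the sampled parameter is optimistic.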
