Search Results for author: Pierre Clavier

Found 3 papers, 0 papers with code

VITS : Variational Inference Thompson Sampling for contextual bandits

no code implementations • 19 Jul 2023 • Pierre Clavier, Tom Huix, Alain Durmus

In this paper, we introduce and analyze a variant of the Thompson sampling (TS) algorithm for contextual bandits.

Multi-Armed Bandits, Thompson Sampling

Towards Minimax Optimality of Model-based Robust Reinforcement Learning

no code implementations • 10 Feb 2023 • Pierre Clavier, Erwan Le Pennec, Matthieu Geist

In this paper, we consider uncertainty sets defined with an $L_p$-ball (recovering the TV case), and study the sample complexity of any planning algorithm (with high accuracy guarantee on the solution) applied to an empirical RMDP estimated using the generative model.

Reinforcement Learning (RL)

Robust Reinforcement Learning with Distributional Risk-averse formulation

no code implementations • 14 Jun 2022 • Pierre Clavier, Stéphanie Allassonière, Erwan Le Pennec

Robust Reinforcement Learning tries to make predictions more robust to changes in the dynamics or rewards of the system.

Distributional Reinforcement Learning, Reinforcement Learning
