Search Results for author: Jean Tarbouriech

Found 9 papers, 1 paper with code

Adaptive Multi-Goal Exploration

no code implementations • 23 Nov 2021 • Jean Tarbouriech, Omar Darwiche Domingues, Pierre Ménard, Matteo Pirotta, Michal Valko, Alessandro Lazaric

We introduce a generic strategy for provably efficient multi-goal exploration.

Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching

1 code implementation • ICML Workshop URL 2021 • Pierre-Alexandre Kamienny, Jean Tarbouriech, Sylvain Lamprier, Alessandro Lazaric, Ludovic Denoyer

Learning meaningful behaviors in the absence of reward is a difficult problem in reinforcement learning.

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret

no code implementations • NeurIPS 2021 • Jean Tarbouriech, Runlong Zhou, Simon S. Du, Matteo Pirotta, Michal Valko, Alessandro Lazaric

We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state.

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs

no code implementations • NeurIPS 2020 • Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

We investigate the exploration of an unknown environment when no reward function is provided.

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

no code implementations • NeurIPS 2021 • Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.

reinforcement-learning Reinforcement Learning (RL)

Active Model Estimation in Markov Decision Processes

no code implementations • 6 Mar 2020 • Jean Tarbouriech, Shubhanshu Shekhar, Matteo Pirotta, Mohammad Ghavamzadeh, Alessandro Lazaric

Using a number of simple domains with heterogeneous noise in their transitions, we show that our heuristic-based algorithm outperforms both our original algorithm and the maximum entropy algorithm in the small sample regime, while achieving asymptotic performance similar to that of the original algorithm.

Common Sense Reasoning Efficient Exploration

Adversarial Attacks on Linear Contextual Bandits

no code implementations • NeurIPS 2020 • Evrard Garcelon, Baptiste Roziere, Laurent Meunier, Jean Tarbouriech, Olivier Teytaud, Alessandro Lazaric, Matteo Pirotta

In many of these domains, malicious agents may have incentives to attack the bandit algorithm to induce it to perform a desired behavior.

Multi-Armed Bandits Recommendation Systems

No-Regret Exploration in Goal-Oriented Reinforcement Learning

no code implementations • ICML 2020 • Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric

Many popular reinforcement learning problems (e.g., navigation in a maze, some Atari games, mountain car) are instances of the episodic setting under its stochastic shortest path (SSP) formulation, where an agent has to achieve a goal state while minimizing the cumulative cost.

Atari Games reinforcement-learning +1

Active Exploration in Markov Decision Processes

no code implementations • 28 Feb 2019 • Jean Tarbouriech, Alessandro Lazaric

As the noise level is initially unknown, we need to trade off the exploration of the environment to estimate the noise and the exploitation of these estimates to compute a policy maximizing the accuracy of the mean predictions.
