Search Results for author: Yannis Flet-Berliac

Found 10 papers, 2 papers with code

PASTA: Pretrained Action-State Transformer Agents

no code implementations · 20 Jul 2023 · Raphael Boige, Yannis Flet-Berliac, Arthur Flajolet, Guillaume Richard, Thomas Pierrot

Self-supervised learning has brought about a revolutionary paradigm shift in various computing domains, including NLP, vision, and biology.

Language Modelling · Masked Language Modeling · +3

Model-based Offline Reinforcement Learning with Local Misspecification

no code implementations · 26 Jan 2023 · Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill

We present a model-based offline reinforcement learning policy performance lower bound that explicitly captures dynamics model misspecification and distribution mismatch, and we propose an empirical algorithm for optimal offline policy selection.

D4RL · reinforcement-learning · +1
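
The excerpt above only names the ingredients of the bound. As a loose illustration (not the paper's bound), the sketch below scores each candidate policy by a pessimistic estimate, its model-based return minus penalties for dynamics-model error and distribution mismatch, and selects the maximizer; the penalty form, weights, and names are all assumptions.

# Hedged sketch: pessimistic offline policy selection. Illustrative only, not the paper's bound.
import numpy as np

def pessimistic_score(est_return, model_error, dist_mismatch, lam=1.0, beta=1.0):
    """Lower-bound-style score: penalize the estimated return by dynamics-model
    misspecification (lam * model_error) and by distribution mismatch (beta * dist_mismatch)."""
    return est_return - lam * model_error - beta * dist_mismatch

def select_policy(candidates):
    """candidates: list of dicts with keys 'name', 'est_return', 'model_error', 'dist_mismatch'."""
    scores = [pessimistic_score(c["est_return"], c["model_error"], c["dist_mismatch"])
              for c in candidates]
    return candidates[int(np.argmax(scores))]["name"]

# Toy usage: the policy with a slightly lower estimated return but a better-fitting model
# and smaller mismatch wins under the pessimistic score.
candidates = [
    {"name": "pi_A", "est_return": 100.0, "model_error": 30.0, "dist_mismatch": 10.0},
    {"name": "pi_B", "est_return": 90.0,  "model_error": 5.0,  "dist_mismatch": 5.0},
]
print(select_policy(candidates))   # -> "pi_B"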

Data-Efficient Pipeline for Offline Reinforcement Learning with Limited Data

no code implementations · 16 Oct 2022 · Allen Nie, Yannis Flet-Berliac, Deon R. Jordan, William Steenbergen, Emma Brunskill

Inspired by statistical model selection methods for supervised learning, we introduce a task- and method-agnostic pipeline for automatically training, comparing, selecting, and deploying the best policy when the provided dataset is limited in size.

Model Selection · Offline RL · +2
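
The pipeline itself is not spelled out in the excerpt. A minimal sketch of a generic selection loop of this kind, assuming hypothetical train_policy and evaluate_policy callables (for instance an offline RL trainer and an off-policy value estimate), could look as follows:

# Hedged sketch: repeated random splits of the limited dataset, off-policy scoring of each
# candidate configuration on the held-out part, then the winner is retrained on all data.
import random

def select_and_deploy(dataset, candidates, train_policy, evaluate_policy, n_splits=5, seed=0):
    """candidates maps a configuration name to its config object."""
    rng = random.Random(seed)
    scores = {name: [] for name in candidates}
    for _ in range(n_splits):
        data = dataset[:]
        rng.shuffle(data)
        cut = int(0.8 * len(data))
        train, valid = data[:cut], data[cut:]
        for name, config in candidates.items():
            policy = train_policy(config, train)                  # e.g. an offline RL algorithm + hyperparameters
            scores[name].append(evaluate_policy(policy, valid))   # e.g. an off-policy value estimate
    best = max(scores, key=lambda n: sum(scores[n]) / len(scores[n]))
    return train_policy(candidates[best], dataset)                # retrain the winner on all the data

# Toy usage with stand-in callables: "policies" are numbers, "training" scales the data mean,
# "evaluation" is negative distance to the held-out mean.
data = list(range(20))
configs = {"cfg_a": 0.5, "cfg_b": 0.9}
train = lambda cfg, d: cfg * sum(d) / len(d)
evalp = lambda pol, d: -abs(pol - sum(d) / len(d))
print(select_and_deploy(data, configs, train, evalp))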

Offline Policy Optimization with Eligible Actions

1 code implementation · 1 Jul 2022 · Yao Liu, Yannis Flet-Berliac, Emma Brunskill

Offline policy optimization could have a large impact on many real-world decision-making problems, as online learning may be infeasible in many applications.

Continuous Control · Decision Making

SAAC: Safe Reinforcement Learning as an Adversarial Game of Actor-Critics

no code implementations · 20 Apr 2022 · Yannis Flet-Berliac, Debabrota Basu

In SAAC, the adversary aims to break the safety constraint while the RL agent aims to maximize the constrained value function given the adversary's policy.

Continuous Control · Decision Making · +4
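
To make the two-player structure concrete, here is a toy, heavily simplified sketch (not the SAAC algorithm): an adversary parameter ascends the constraint cost, the agent ascends a Lagrangian surrogate of the constrained objective given the current adversary, and a multiplier grows while the induced cost exceeds the budget. All quantities and step sizes below are illustrative.

# Toy two-player loop, illustrative only: theta is the agent's parameter, phi the adversary's,
# lam a Lagrange-style multiplier, budget the allowed constraint cost.
def reward(theta):                 # toy reward, maximized at theta = 2
    return -(theta - 2.0) ** 2

def cost(theta, phi):              # toy constraint cost, amplified by the adversary
    return phi * theta

budget, lam, lr = 1.0, 1.0, 0.05
theta, phi = 0.0, 0.5

for _ in range(200):
    # Adversary: gradient ascent on the cost, i.e. it tries to break the constraint cost <= budget.
    phi = min(max(phi + lr * theta, 0.0), 2.0)
    # Agent: gradient ascent on reward(theta) - lam * cost(theta, phi), the Lagrangian surrogate.
    grad_theta = -2.0 * (theta - 2.0) - lam * phi
    theta = min(max(theta + lr * grad_theta, 0.0), 4.0)
    # Multiplier: grows while the adversarially induced cost exceeds the budget.
    lam = max(0.0, lam + lr * (cost(theta, phi) - budget))

print(round(theta, 3), round(phi, 3), round(lam, 3))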

Adversarially Guided Actor-Critic

1 code implementation · ICLR 2021 · Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck.

Efficient Exploration

Learning Value Functions in Deep Policy Gradients using Residual Variance

no code implementations · ICLR 2021 · Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvement across a variety of continuous control tasks and algorithms.

Continuous Control · Decision Making
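
Going by the title, the idea is to fit the critic to the variance of its residuals rather than to their mean squared error; whether the snippet below matches the paper's exact estimator is an assumption, it only illustrates the quantity.

# Hedged sketch: a residual-variance critic loss as a drop-in replacement for MSE.
import torch

def residual_variance_loss(values, returns):
    """Variance of the residuals (returns - values) rather than their mean square.
    Unlike MSE, this ignores any constant offset in the value predictions."""
    residuals = returns - values
    return torch.var(residuals, unbiased=False)

values = torch.tensor([0.5, 1.0, 1.5])
returns = torch.tensor([1.0, 2.0, 3.0])
print(residual_variance_loss(values, returns))   # variance of the residuals [0.5, 1.0, 1.5]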

MERL: Multi-Head Reinforcement Learning

no code implementations · 26 Sep 2019 · Yannis Flet-Berliac, Philippe Preux

In this paper, we introduce and define MERL, the multi-head reinforcement learning framework used throughout this work.

Continuous Control · reinforcement-learning · +2
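
The excerpt does not say which quantities the extra heads predict. As a purely structural sketch, a shared torso with one policy head and auxiliary scalar heads trained jointly (the specific auxiliary targets below are assumptions) might look like:

# Hedged sketch of a multi-head actor: shared torso, policy head, auxiliary scalar heads.
import torch
import torch.nn as nn

class MultiHeadActor(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.torso = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.policy_head = nn.Linear(hidden, n_actions)   # action logits
        self.aux_head_1 = nn.Linear(hidden, 1)            # e.g. a predicted variance-explained signal (assumed)
        self.aux_head_2 = nn.Linear(hidden, 1)            # e.g. another scalar self-assessment target (assumed)

    def forward(self, obs):
        h = self.torso(obs)
        return self.policy_head(h), self.aux_head_1(h), self.aux_head_2(h)

actor = MultiHeadActor(obs_dim=4, n_actions=2)
logits, aux1, aux2 = actor(torch.randn(8, 4))
print(logits.shape, aux1.shape, aux2.shape)   # torch.Size([8, 2]) torch.Size([8, 1]) torch.Size([8, 1])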

Samples Are Useful? Not Always: denoising policy gradient updates using variance explained

no code implementations · 25 Sep 2019 · Yannis Flet-Berliac, Philippe Preux

In this work, Vex, the variance explained by the value function, is used to evaluate the impact each transition will have on learning: this criterion refines sampling and improves the policy gradient algorithm.

Continuous Control · Denoising
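
Going by the title, Vex is the variance of the returns explained by the value predictions, i.e. Vex = 1 - Var(returns - values) / Var(returns). A small sketch of that quantity (the exact per-transition use is not given in the excerpt, so it is omitted here):

# Hedged sketch: the variance-explained criterion as a batch statistic.
import numpy as np

def variance_explained(values, returns, eps=1e-8):
    """1 when the value function tracks the returns perfectly; near 0 (or negative)
    when it explains none of their variance."""
    values = np.asarray(values, dtype=float)
    returns = np.asarray(returns, dtype=float)
    return 1.0 - np.var(returns - values) / (np.var(returns) + eps)

print(variance_explained([1.0, 2.0, 3.0], [1.1, 2.0, 2.9]))   # close to 1: predictions track returns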

Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL

no code implementations · 8 Apr 2019 · Yannis Flet-Berliac, Philippe Preux

In this work, we use this metric to select samples that are useful to learn from, and we demonstrate that this selection can significantly improve the performance of policy gradient methods.

Continuous Control · Denoising · +1
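
The metric itself is not defined in the excerpt. Assuming only that each transition comes with a relevance score in [0, 1], a hedged sketch of filtering the policy-gradient surrogate by that score (the threshold and the source of the score are assumptions):

# Hedged sketch: mask out low-relevance transitions when forming the policy-gradient loss.
import torch

def filtered_policy_gradient_loss(log_probs, advantages, relevance, threshold=0.5):
    """Standard surrogate -(log pi(a|s) * advantage), averaged only over transitions
    whose relevance score passes the threshold."""
    mask = (relevance > threshold).float()
    per_sample = -(log_probs * advantages) * mask
    return per_sample.sum() / mask.sum().clamp(min=1.0)

log_probs = torch.tensor([-0.2, -1.3, -0.7])
advantages = torch.tensor([1.0, -0.5, 2.0])
relevance = torch.tensor([0.9, 0.2, 0.8])    # the middle, low-relevance sample is filtered out
print(filtered_policy_gradient_loss(log_probs, advantages, relevance))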
