Search Results for author: Rémi Tachet des Combes

Found 6 papers, 4 papers with code

A single gradient step finds adversarial examples on random two-layers neural networks

no code implementations NeurIPS 2021 Sébastien Bubeck, Yeshwanth Cherapanamjeri, Gauthier Gidel, Rémi Tachet des Combes

Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layers ReLU neural networks.
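As a rough illustration of the claim in the title, the sketch below builds a random two-layer ReLU network and takes one gradient step on the input. Everything here is an assumption made for illustration rather than the paper's construction: the dimensions `d` and `k`, the weight scaling, the helper `forward_and_grad`, and in particular the step size `eta`, which is chosen so that the linearised output becomes the negative of the original output.

```python
import math
import random

random.seed(0)
d, k = 100, 300  # input dimension and hidden width (illustrative values)

# Random two-layer ReLU network f(x) = sum_i a_i * relu(w_i . x).
W = [[random.gauss(0.0, 1.0 / math.sqrt(d)) for _ in range(d)] for _ in range(k)]
a = [random.choice((-1.0, 1.0)) / math.sqrt(k) for _ in range(k)]


def forward_and_grad(x):
    """Return f(x) and its gradient with respect to the input x."""
    value, grad = 0.0, [0.0] * d
    for w_i, a_i in zip(W, a):
        pre = sum(w_ij * x_j for w_ij, x_j in zip(w_i, x))
        if pre > 0.0:  # active ReLU unit: contributes a_i * w_i to the gradient
            value += a_i * pre
            for j in range(d):
                grad[j] += a_i * w_i[j]
    return value, grad


# Random input on the sphere of radius sqrt(d).
x = [random.gauss(0.0, 1.0) for _ in range(d)]
scale = math.sqrt(d) / math.sqrt(sum(v * v for v in x))
x = [v * scale for v in x]

f0, g = forward_and_grad(x)
sign = 1.0 if f0 > 0 else -1.0

# Assumed step size: under a linear approximation, f(x_adv) would equal -f0.
eta = 2.0 * abs(f0) / sum(v * v for v in g)
x_adv = [x_j - eta * sign * g_j for x_j, g_j in zip(x, g)]
f1, _ = forward_and_grad(x_adv)

step_norm = eta * math.sqrt(sum(v * v for v in g))
print(f"f(x) = {f0:+.3f}, f(x_adv) = {f1:+.3f}")
print(f"relative perturbation size = {step_norm / math.sqrt(d):.3f}")
```

The printed values let you check whether the output changed sign and how small the perturbation is relative to the norm of the input.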

Safe Policy Improvement with an Estimated Baseline Policy

no code implementations 11 Sep 2019 Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes

Previous work has shown the unreliability of existing algorithms in the batch Reinforcement Learning setting, and proposed the theoretically-grounded Safe Policy Improvement with Baseline Bootstrapping (SPIBB) fix: reproduce the baseline policy in the uncertain state-action pairs, in order to control the variance on the trained policy performance.
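The rule described above (copy the baseline where data is scarce, improve elsewhere) can be sketched for a single state as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `spibb_greedy_step`, the count threshold `n_wedge`, and the example numbers are all hypothetical, and the Q-values are assumed to be already estimated from the batch.

```python
def spibb_greedy_step(q_values, baseline_probs, counts, n_wedge):
    """Return a policy over actions for one state using a SPIBB-style rule.

    Actions observed fewer than n_wedge times are treated as uncertain and
    keep their baseline probability. The remaining probability mass goes to
    the well-observed action with the highest estimated Q-value.
    """
    n_actions = len(q_values)
    policy = [0.0] * n_actions

    # Well-observed actions are the only ones the policy may deviate on.
    trusted = [a for a in range(n_actions) if counts[a] >= n_wedge]

    # Uncertain actions: reproduce the baseline policy exactly.
    for a in range(n_actions):
        if a not in trusted:
            policy[a] = baseline_probs[a]

    # Trusted actions: move all of their baseline mass to the best one.
    if trusted:
        free_mass = sum(baseline_probs[a] for a in trusted)
        best = max(trusted, key=lambda a: q_values[a])
        policy[best] = free_mass
    return policy


policy = spibb_greedy_step(
    q_values=[1.0, 5.0, 3.0],
    baseline_probs=[0.5, 0.25, 0.25],
    counts=[40, 2, 25],
    n_wedge=10,
)
print(policy)  # [0.0, 0.25, 0.75]
```

In this example the second action has the highest estimated value but only two observations, so it keeps its baseline probability of 0.25; the remaining mass of 0.75 goes to the best well-observed action.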


Safe Policy Improvement with Soft Baseline Bootstrapping

2 code implementations 11 Jul 2019 Kimia Nadjahi, Romain Laroche, Rémi Tachet des Combes

Batch Reinforcement Learning (Batch RL) consists in training a policy using trajectories collected with another policy, called the behavioural policy.

Safe Policy Improvement with Baseline Bootstrapping

2 code implementations 19 Dec 2017 Romain Laroche, Paul Trichelair, Rémi Tachet des Combes

Finally, we implement a model-free version of SPIBB and show its benefits on a navigation task with deep RL implementation called SPIBB-DQN, which is, to the best of our knowledge, the first RL algorithm relying on a neural network representation able to train efficiently and reliably from batch data, without any interaction with the environment.
