Search Results for author: Rémi Tachet des Combes

Found 6 papers, 4 papers with code

Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting

1 code implementation • 22 Jun 2023 • Zhang-Wei Hong, Pulkit Agrawal, Rémi Tachet des Combes, Romain Laroche

This re-weighted sampling strategy may be combined with any offline RL algorithm.

A single gradient step finds adversarial examples on random two-layers neural networks

no code implementations • NeurIPS 2021 • Sébastien Bubeck, Yeshwanth Cherapanamjeri, Gauthier Gidel, Rémi Tachet des Combes

Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layers ReLU neural networks.

Adversarial score matching and improved sampling for image generation

1 code implementation • ICLR 2021 • Alexia Jolicoeur-Martineau, Rémi Piché-Taillefer, Rémi Tachet des Combes, Ioannis Mitliagkas

Denoising Score Matching with Annealed Langevin Sampling (DSM-ALS) has recently found success in generative modeling.

Ranked #54 on Image Generation on CIFAR-10

Denoising • Image Generation

Safe Policy Improvement with an Estimated Baseline Policy

no code implementations • 11 Sep 2019 • Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes

Previous work has shown the unreliability of existing algorithms in the batch Reinforcement Learning setting, and proposed the theoretically grounded Safe Policy Improvement with Baseline Bootstrapping (SPIBB) fix: reproduce the baseline policy in the uncertain state-action pairs, in order to control the variance of the trained policy's performance.
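
The SPIBB fix described above can be illustrated with a minimal sketch of a single-state policy update. This is not the authors' code: the function name `spibb_greedy_step`, the count threshold `n_wedge`, and the choice to treat a state-action pair as "uncertain" when its dataset count falls below that threshold are assumptions made for illustration.

```python
import numpy as np

def spibb_greedy_step(q_values, baseline_probs, counts, n_wedge):
    """Return a policy for one state that copies the baseline on uncertain actions.

    All names and the count-based uncertainty test are illustrative assumptions.
    """
    q_values = np.asarray(q_values, dtype=float)
    baseline_probs = np.asarray(baseline_probs, dtype=float)
    counts = np.asarray(counts)

    # Assumption: an action is "uncertain" if it was seen fewer than n_wedge times.
    uncertain = counts < n_wedge

    # Reproduce the baseline probabilities on the uncertain actions.
    policy = np.where(uncertain, baseline_probs, 0.0)

    if not uncertain.all():
        # Probability mass the baseline puts on well-estimated actions.
        free_mass = baseline_probs[~uncertain].sum()
        # Give that mass to the best well-estimated action.
        trusted_q = np.where(uncertain, -np.inf, q_values)
        policy[int(np.argmax(trusted_q))] += free_mass
    return policy

example = spibb_greedy_step(
    q_values=[1.0, 5.0, 2.0, 0.0],
    baseline_probs=[0.25, 0.25, 0.25, 0.25],
    counts=[10, 1, 10, 10],
    n_wedge=5,
)
```

In this example the second action has the highest estimated value but too few samples, so the sketch keeps its baseline probability and shifts the remaining mass to the best action with enough data.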

Management

Safe Policy Improvement with Soft Baseline Bootstrapping

2 code implementations • 11 Jul 2019 • Kimia Nadjahi, Romain Laroche, Rémi Tachet des Combes

Batch Reinforcement Learning (Batch RL) consists of training a policy using trajectories collected with another policy, called the behavioural policy.

Safe Policy Improvement with Baseline Bootstrapping

2 code implementations • 19 Dec 2017 • Romain Laroche, Paul Trichelair, Rémi Tachet des Combes

Finally, we implement a model-free version of SPIBB and show its benefits on a navigation task with a deep RL implementation called SPIBB-DQN, which is, to the best of our knowledge, the first RL algorithm relying on a neural network representation that can train efficiently and reliably from batch data, without any interaction with the environment.
