Search Results for author: Denis Yarats

Found 21 papers, 18 papers with code

Watch and Match: Supercharging Imitation with Regularized Optimal Transport

no code implementations · 30 Jun 2022 · Siddhant Haldar, Vaibhav Mathur, Denis Yarats, Lerrel Pinto

Our experiments on 20 visual control tasks across the DeepMind Control Suite, the OpenAI Robotics Suite, and the Meta-World Benchmark demonstrate an average of 7.8x faster imitation to reach 90% of expert performance compared to prior state-of-the-art methods.

Imitation Learning

CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery

1 code implementation · 1 Feb 2022 · Michael Laskin, Hao Liu, Xue Bin Peng, Denis Yarats, Aravind Rajeswaran, Pieter Abbeel

We introduce Contrastive Intrinsic Control (CIC), an algorithm for unsupervised skill discovery that maximizes the mutual information between state-transitions and latent skill vectors.

Contrastive Learning · reinforcement-learning · +2
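Mutual-information objectives of this kind are typically lower-bounded with a noise-contrastive (InfoNCE-style) estimator. Below is a minimal, generic NumPy sketch of such a loss, where the (query, key) pair at the same batch index is the positive and all other pairs are negatives; the function name and temperature value are illustrative simplifications, not CIC's exact estimator.

```python
import numpy as np

def info_nce(queries, keys, temperature=0.5):
    """Generic InfoNCE loss: matched indices are positives,
    every other pairing in the batch is a negative."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = q @ k.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))  # -log p(positive | batch)
```

With embeddings of state-transitions as queries and skill vectors as keys, driving this loss down pushes matched pairs together and unmatched pairs apart, which is the sense in which it estimates a mutual-information lower bound.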

URLB: Unsupervised Reinforcement Learning Benchmark

1 code implementation · 28 Oct 2021 · Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel

Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to solve a range of complex yet specific control tasks.

Continuous Control · reinforcement-learning · +2

Reinforcement Learning with Prototypical Representations

1 code implementation · 22 Feb 2021 · Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto

Unfortunately, in RL, representation learning is confounded with the exploratory experience of the agent -- learning a useful representation requires diverse data, while effective exploration is only possible with coherent representations.

Continuous Control · reinforcement-learning · +3

On the model-based stochastic value gradient for continuous reinforcement learning

1 code implementation · 28 Aug 2020 · Brandon Amos, Samuel Stanton, Denis Yarats, Andrew Gordon Wilson

For over a decade, model-based reinforcement learning has been seen as a way to leverage control-based domain knowledge to improve the sample-efficiency of reinforcement learning agents.

Continuous Control · Humanoid Control · +4

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

4 code implementations ICLR 2021 Ilya Kostrikov, Denis Yarats, Rob Fergus

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.

Atari Games 100k · Continuous Control · +4
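A common form of this augmentation is a random shift: pad the observation by a few pixels with edge replication, then crop a random window back to the original size. The NumPy sketch below illustrates the idea; the `random_shift` name and the 4-pixel pad are illustrative choices, not a verbatim reproduction of the paper's code.

```python
import numpy as np

def random_shift(img, pad=4, rng=None):
    """Pad an H x W x C image by `pad` pixels with edge replication,
    then crop a random H x W window, i.e. shift the image by up to
    `pad` pixels in each direction."""
    rng = rng if rng is not None else np.random.default_rng()
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]
```

Applying an independent shift to each observation before it reaches the value function regularizes learning from pixels without any auxiliary loss, which is the regularization effect the abstract describes.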

Generalized Inner Loop Meta-Learning

3 code implementations · 3 Oct 2019 · Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala

Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem.

Meta-Learning · reinforcement-learning · +1
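The nested pattern the abstract refers to, an outer objective evaluated at the (approximate) solution of an inner optimization, can be made concrete on a toy problem. The sketch below unrolls a few inner gradient steps on a quadratic and propagates the derivative through the unroll by hand to obtain a hypergradient; real implementations do this with automatic differentiation, so the analytic bookkeeping here is purely illustrative.

```python
def inner_unroll(lam, w0=0.0, alpha=0.1, steps=20):
    """Run `steps` gradient steps on the inner loss (w - lam)**2,
    tracking dw/dlam through the unroll by the chain rule."""
    w, dw_dlam = w0, 0.0
    for _ in range(steps):
        w = w - alpha * 2.0 * (w - lam)                       # inner gradient step
        dw_dlam = (1 - 2.0 * alpha) * dw_dlam + 2.0 * alpha   # derivative of the step wrt lam
    return w, dw_dlam

# Outer problem: pick lam so the inner solution lands at 3.
lam = 0.0
for _ in range(100):
    w, dw = inner_unroll(lam)
    lam -= 0.5 * 2.0 * (w - 3.0) * dw  # hypergradient of (w - 3)**2 wrt lam
w_final, _ = inner_unroll(lam)
```

The outer loop never touches the inner parameter directly; it only adjusts `lam` through the derivative of the unrolled inner solver, which is exactly the "solution to a nested optimization problem" structure the abstract describes.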

The Differentiable Cross-Entropy Method

1 code implementation ICML 2020 Brandon Amos, Denis Yarats

We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the objective function's parameters.

BIG-bench Machine Learning · Continuous Control · +1
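For context, the non-differentiable baseline works as follows: sample candidates from a Gaussian, keep the lowest-cost elites, refit the Gaussian to them, and repeat. A minimal NumPy sketch of vanilla CEM is below; the differentiable variant studied in the paper replaces the hard top-k elite selection with a soft, differentiable relaxation, which is not shown here.

```python
import numpy as np

def cem(f, mu, sigma, n_samples=100, n_elite=10, iters=20, rng=None):
    """Vanilla cross-entropy method: iteratively refit a diagonal
    Gaussian sampling distribution to the lowest-cost elite samples."""
    rng = rng if rng is not None else np.random.default_rng(0)
    for _ in range(iters):
        xs = rng.normal(mu, sigma, size=(n_samples, mu.size))
        costs = np.array([f(x) for x in xs])
        elites = xs[np.argsort(costs)[:n_elite]]          # hard top-k selection
        mu = elites.mean(axis=0)                          # refit the Gaussian
        sigma = elites.std(axis=0) + 1e-6                 # floor to avoid collapse
    return mu
```

The hard `argsort`-based selection is precisely the step that blocks gradients from flowing back to the objective's parameters, motivating the differentiable variant.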

Hierarchical Decision Making by Generating and Following Natural Language Instructions

1 code implementation NeurIPS 2019 Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.

Decision Making

Quasi-hyperbolic momentum and Adam for deep learning

2 code implementations ICLR 2019 Jerry Ma, Denis Yarats

Momentum-based acceleration of stochastic gradient descent (SGD) is widely used in deep learning.

Stochastic Optimization
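Quasi-hyperbolic momentum (QHM) interpolates between the two: with an exponential-moving-average buffer g_t = beta * g_{t-1} + (1 - beta) * grad, the update is theta <- theta - lr * ((1 - nu) * grad + nu * g_t), so nu = 0 recovers plain SGD and nu = 1 recovers momentum with a dampened buffer. A small pure-Python sketch on a 1-D quadratic (the hyperparameter values are illustrative, not the paper's recommended defaults):

```python
def qhm_step(theta, grad, buf, lr=0.1, beta=0.9, nu=0.7):
    """One quasi-hyperbolic momentum update on a scalar parameter."""
    buf = beta * buf + (1 - beta) * grad                # EMA momentum buffer
    theta = theta - lr * ((1 - nu) * grad + nu * buf)   # mix raw gradient and buffer
    return theta, buf

# Minimize f(theta) = theta**2, whose gradient is 2 * theta.
theta, buf = 5.0, 0.0
for _ in range(200):
    theta, buf = qhm_step(theta, 2.0 * theta, buf)
```

The single extra parameter nu is what lets QHM sweep continuously between the SGD and momentum behaviors rather than committing to either.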

Hierarchical Text Generation and Planning for Strategic Dialogue

1 code implementation ICML 2018 Denis Yarats, Mike Lewis

End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors.

Decision Making · reinforcement-learning · +3

Deal or No Deal? End-to-End Learning of Negotiation Dialogues

no code implementations EMNLP 2017 Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.

Deal or No Deal? End-to-End Learning for Negotiation Dialogues

2 code implementations · 16 Jun 2017 · Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra

Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.
