no code implementations • 30 Jun 2022 • Siddhant Haldar, Vaibhav Mathur, Denis Yarats, Lerrel Pinto
Our experiments on 20 visual control tasks across the DeepMind Control Suite, the OpenAI Robotics Suite, and the Meta-World Benchmark demonstrate an average of 7.8x faster imitation to reach 90% of expert performance compared to prior state-of-the-art methods.
1 code implementation • 1 Feb 2022 • Michael Laskin, Hao Liu, Xue Bin Peng, Denis Yarats, Aravind Rajeswaran, Pieter Abbeel
We introduce Contrastive Intrinsic Control (CIC), an algorithm for unsupervised skill discovery that maximizes the mutual information between state-transitions and latent skill vectors.
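A minimal sketch of the contrastive estimator this objective suggests, assuming an InfoNCE-style loss with illustrative embedding shapes and temperature (not the authors' exact code):

```python
import torch
import torch.nn.functional as F

def cic_contrastive_loss(transition_emb, skill_emb, temperature=0.5):
    """transition_emb, skill_emb: (batch, dim) embeddings of (s, s') pairs and
    of their latent skills; matching rows are treated as positive pairs."""
    q = F.normalize(transition_emb, dim=-1)
    k = F.normalize(skill_emb, dim=-1)
    logits = q @ k.t() / temperature                   # (batch, batch) similarities
    labels = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)             # lower-bounds the mutual information
```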
1 code implementation • 31 Jan 2022 • Denis Yarats, David Brandfonbrener, Hao Liu, Michael Laskin, Pieter Abbeel, Alessandro Lazaric, Lerrel Pinto
In this work, we propose Exploratory data for Offline RL (ExORL), a data-centric approach to offline RL.
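The recipe is simple enough to sketch: collect transitions with an unsupervised exploration agent, relabel them with the downstream task's reward, and hand the result to any offline RL algorithm (the transition layout and `reward_fn` below are illustrative):

```python
def relabel_dataset(transitions, reward_fn):
    """transitions: iterable of (obs, action, next_obs) tuples collected
    reward-free; reward_fn: the downstream task's reward function."""
    relabeled = []
    for obs, action, next_obs in transitions:
        reward = reward_fn(obs, action, next_obs)  # task reward replaces the absent exploration reward
        relabeled.append((obs, action, reward, next_obs))
    return relabeled  # ready for TD3, CQL, behavioral cloning, etc.
```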
1 code implementation • NeurIPS 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus
Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments.
1 code implementation • 28 Oct 2021 • Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel
Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to solve a range of complex yet specific control tasks.
no code implementations • 29 Sep 2021 • Samuel Cohen, Brandon Amos, Marc Peter Deisenroth, Mikael Henaff, Eugene Vinitsky, Denis Yarats
In this setting, we explore recipes for imitation learning based on adversarial learning and optimal transport.
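A rough sketch of the optimal-transport side of such a recipe, assuming entropic OT (Sinkhorn) between agent and expert states under a Euclidean cost; the cost choice and reward assignment are assumptions, not the paper's exact setup:

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, iters=100):
    """Entropy-regularized transport plan between uniform marginals."""
    n, m = cost.shape
    a, b = np.ones(n) / n, np.ones(m) / m
    K = np.exp(-cost / eps)
    v = np.ones(m) / m
    for _ in range(iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_imitation_rewards(agent_states, expert_states):
    cost = np.linalg.norm(agent_states[:, None] - expert_states[None, :], axis=-1)
    plan = sinkhorn_plan(cost)
    return -(plan * cost).sum(axis=1)   # per-agent-state pseudo-reward
```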
8 code implementations • ICLR 2022 • Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control.
1 code implementation • 22 Feb 2021 • Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
Unfortunately, in RL, representation learning is confounded with the exploratory experience of the agent -- learning a useful representation requires diverse data, while effective exploration is only possible with coherent representations.
1 code implementation • 28 Aug 2020 • Brandon Amos, Samuel Stanton, Denis Yarats, Andrew Gordon Wilson
For over a decade, model-based reinforcement learning has been seen as a way to leverage control-based domain knowledge to improve the sample-efficiency of reinforcement learning agents.
1 code implementation • NeurIPS 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus
Our agent outperforms other baselines specifically designed to improve generalization in RL.
4 code implementations • ICLR 2021 • Ilya Kostrikov, Denis Yarats, Rob Fergus
We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.
Ranked #1 on Continuous Control on DeepMind Walker Walk (Images)
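The augmentation itself is small; a minimal sketch of the random-shift transform (pad, then crop back at a random offset), with the pad size as an illustrative default:

```python
import torch
import torch.nn.functional as F

def random_shift(imgs, pad=4):
    """imgs: (batch, channels, H, W) pixel observations."""
    n, c, h, w = imgs.shape
    padded = F.pad(imgs, (pad, pad, pad, pad), mode='replicate')
    out = torch.empty_like(imgs)
    for i in range(n):
        top = torch.randint(0, 2 * pad + 1, (1,)).item()
        left = torch.randint(0, 2 * pad + 1, (1,)).item()
        out[i] = padded[i, :, top:top + h, left:left + w]   # crop back to H x W
    return out
```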
1 code implementation • 9 Oct 2019 • Jerry Ma, Denis Yarats
The stability of such algorithms is often improved with a warmup schedule for the learning rate.
Ranked #6 on Machine Translation on WMT2016 English-German
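A sketch of the kind of linear warmup studied, assuming the untuned 2 / (1 - beta2) horizon discussed for Adam-style optimizers as the default length:

```python
def warmup_lr(step, base_lr, beta2=0.999):
    horizon = 2.0 / (1.0 - beta2)                   # untuned warmup length
    return base_lr * min(1.0, (step + 1) / horizon)

# Usage: scale the optimizer's learning rate during early training.
# for group in optimizer.param_groups:
#     group['lr'] = warmup_lr(step, base_lr=3e-4)
```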
3 code implementations • 3 Oct 2019 • Edward Grefenstette, Brandon Amos, Denis Yarats, Phu Mon Htut, Artem Molchanov, Franziska Meier, Douwe Kiela, Kyunghyun Cho, Soumith Chintala
Many (but not all) approaches self-qualifying as "meta-learning" in deep learning and reinforcement learning fit a common pattern of approximating the solution to a nested optimization problem.
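A minimal sketch of that nested pattern written with the paper's `higher` library: run a differentiable inner SGD loop, then backpropagate a query loss to the initial weights (the MSE losses and batch layout are placeholders):

```python
import torch
import torch.nn.functional as F
import higher

def meta_step(model, meta_opt, inner_lr, tasks, inner_steps=1):
    """tasks: iterable of ((x_s, y_s), (x_q, y_q)) support/query batches."""
    inner_opt = torch.optim.SGD(model.parameters(), lr=inner_lr)
    meta_opt.zero_grad()
    for (x_s, y_s), (x_q, y_q) in tasks:
        with higher.innerloop_ctx(model, inner_opt, copy_initial_weights=False) as (fmodel, diffopt):
            for _ in range(inner_steps):
                diffopt.step(F.mse_loss(fmodel(x_s), y_s))  # differentiable inner update
            F.mse_loss(fmodel(x_q), y_q).backward()         # outer grads reach the initial weights
    meta_opt.step()
```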
4 code implementations • 2 Oct 2019 • Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus
A promising approach is to learn a latent representation together with the control policy.
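One common instantiation is an image encoder shared with the actor-critic and trained with an auxiliary reconstruction loss; a minimal sketch, with the architecture and 84x84 inputs as assumptions:

```python
import torch.nn as nn

class PixelAutoencoder(nn.Module):
    """Encoder feeds the policy/critic; decoder supplies a reconstruction loss."""
    def __init__(self, latent_dim=50):
        super().__init__()
        self.encoder = nn.Sequential(                       # 3x84x84 -> latent_dim
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Flatten(), nn.LazyLinear(latent_dim),
        )
        self.decoder = nn.Sequential(                       # latent_dim -> 3x84x84
            nn.Linear(latent_dim, 32 * 20 * 20), nn.Unflatten(1, (32, 20, 20)),
            nn.ConvTranspose2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 3, stride=2, output_padding=1),
        )

    def forward(self, obs):
        z = self.encoder(obs)
        return z, self.decoder(z)   # train with RL loss on z plus MSE(recon, obs)
```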
1 code implementation • ICML 2020 • Brandon Amos, Denis Yarats
We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant that enables us to differentiate the output of CEM with respect to the objective function's parameters.
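For reference, plain CEM looks like the sketch below; the differentiable variant replaces the hard elite selection with a soft, differentiable top-k so gradients can flow to the objective's parameters:

```python
import numpy as np

def cem_minimize(f, dim, iters=50, pop=100, n_elite=10, sigma=1.0):
    mu, sigma = np.zeros(dim), np.full(dim, sigma, dtype=float)
    for _ in range(iters):
        samples = mu + sigma * np.random.randn(pop, dim)          # sample candidates
        scores = np.array([f(x) for x in samples])
        elite = samples[np.argsort(scores)[:n_elite]]             # hard top-k selection
        mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6  # refit the Gaussian
    return mu

# e.g. cem_minimize(lambda x: ((x - 3.0) ** 2).sum(), dim=2) converges near [3, 3]
```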
1 code implementation • NeurIPS 2019 • Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis
We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.
2 code implementations • ICLR 2019 • Jerry Ma, Denis Yarats
Momentum-based acceleration of stochastic gradient descent (SGD) is widely used in deep learning.
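The quasi-hyperbolic momentum (QHM) rule the paper proposes mixes the plain gradient step with the momentum step; a sketch of one update, with illustrative default hyperparameters:

```python
def qhm_step(theta, grad, buf, lr=0.1, beta=0.999, nu=0.7):
    """nu = 1 recovers ordinary momentum; nu = 0 recovers plain SGD."""
    buf = beta * buf + (1 - beta) * grad               # momentum buffer
    theta = theta - lr * ((1 - nu) * grad + nu * buf)  # quasi-hyperbolic mix
    return theta, buf
```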
1 code implementation • ICML 2018 • Denis Yarats, Mike Lewis
End-to-end models for goal-orientated dialogue are challenging to train, because linguistic and strategic aspects are entangled in latent state vectors.
no code implementations • EMNLP 2017 • Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, Dhruv Batra
Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.
3 code implementations • 16 Jun 2017 • Mike Lewis, Denis Yarats, Yann N. Dauphin, Devi Parikh, Dhruv Batra
Much of human dialogue occurs in semi-cooperative settings, where agents with different goals attempt to agree on common decisions.
37 code implementations • ICML 2017 • Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin
The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks.
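The paper replaces that recurrence with stacked gated convolutions; a minimal sketch of one gated (GLU) convolutional block, with kernel size and dimensions as illustrative choices:

```python
import torch.nn as nn
import torch.nn.functional as F

class GatedConvBlock(nn.Module):
    def __init__(self, dim, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(dim, 2 * dim, kernel, padding=kernel // 2)

    def forward(self, x):
        """x: (batch, dim, seq_len) embedded tokens."""
        return x + F.glu(self.conv(x), dim=1)   # GLU halves channels back to dim
```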