4 code implementations • ICLR 2021 • Tatsuya Matsushima, Hiroki Furuta, Yutaka Matsuo, Ofir Nachum, Shixiang Gu
We propose a novel model-based algorithm, Behavior-Regularized Model-ENsemble (BREMEN), that can effectively optimize a policy offline using 10-20 times less data than prior works.
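A minimal sketch of the two ingredients the name points at — an ensemble of learned dynamics models and a behavior-cloned policy initialization — in PyTorch. Layer sizes, ensemble count, and function names here are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

class DynamicsEnsemble(nn.Module):
    """K independent models, each predicting the next-state delta."""
    def __init__(self, obs_dim, act_dim, k=5):
        super().__init__()
        self.models = nn.ModuleList([mlp(obs_dim + act_dim, obs_dim) for _ in range(k)])

    def rollout_step(self, obs, act):
        # Sample one ensemble member per step so imagined rollouts stay diverse.
        m = self.models[torch.randint(len(self.models), (1,)).item()]
        return obs + m(torch.cat([obs, act], dim=-1))

def behavior_clone(policy, dataset_obs, dataset_act, steps=1000, lr=1e-3):
    """Initialize the policy near the behavior policy that collected the offline data."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(steps):
        loss = ((policy(dataset_obs) - dataset_act) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
```

The policy is then improved with regularized updates on rollouts imagined through the ensemble rather than the real environment, which is where the data efficiency comes from.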
1 code implementation • ICLR 2020 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
2 code implementations • 27 Apr 2020 • Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu
Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?
no code implementations • ICLR 2020 • Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
This is a critical shortcoming for applying RL to real-world problems where collecting data is expensive, and models must be tested offline before being deployed to interact with the environment -- e.g. systems that learn from human interaction.
3 code implementations • 6 Nov 2019 • Seyed Kamyar Seyed Ghasemipour, Richard Zemel, Shixiang Gu
We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.
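For reference, $f$-MAX casts imitation as occupancy-measure matching under a generic $f$-divergence (standard definition below; $\rho$ denotes state-action occupancies, and the paper relates AIRL to the reverse-KL instance of this family, i.e. $f(u) = -\log u$):

$$D_f(P \,\|\, Q) = \mathbb{E}_{x \sim q}\!\left[ f\!\left( \frac{p(x)}{q(x)} \right) \right], \qquad \min_{\theta}\; D_f\big(\rho_{\mathrm{exp}}(s,a) \,\|\, \rho_{\pi_\theta}(s,a)\big).$$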
no code implementations • 23 Sep 2019 • Ofir Nachum, Haoran Tang, Xingyu Lu, Shixiang Gu, Honglak Lee, Sergey Levine
Hierarchical reinforcement learning has demonstrated significant success at solving difficult reinforcement learning (RL) tasks.
no code implementations • 13 Aug 2019 • Ofir Nachum, Michael Ahn, Hugo Ponte, Shixiang Gu, Vikash Kumar
Our method hinges on the use of hierarchical sim2real -- a simulated environment is used to learn low-level goal-reaching skills, which are then used as the action space for a high-level RL controller, also trained in simulation.
1 code implementation • 30 Jun 2019 • Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
Most deep reinforcement learning (RL) systems are not able to learn effectively from off-policy data, especially if they cannot explore online in the environment.
2 code implementations • NeurIPS 2019 • Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
We find that, using our approach, agents can learn to solve diverse, temporally extended tasks such as object sorting and multi-object rearrangement, including from raw pixel observations.
3 code implementations • ICLR 2019 • George Tucker, Dieterich Lawson, Shixiang Gu, Chris J. Maddison
Burda et al. (2015) introduced a multi-sample variational bound, IWAE, that is at least as tight as the standard variational lower bound and becomes increasingly tight as the number of samples increases.
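A small numerical illustration of that tightening (a toy Gaussian model, not the paper's setup): the $K$-sample bound $\mathbb{E}[\log \frac{1}{K}\sum_k w_k]$ is computed with a log-sum-exp and grows monotonically with $K$.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(0)

# Toy model: p(z) = N(0,1), p(x|z) = N(z,1), proposal q(z|x) = N(0,1) (the prior).
# IWAE bound: E_q[ log (1/K) sum_k p(x, z_k) / q(z_k | x) ].
x = 1.5

def log_w(z):
    log_p_z = -0.5 * (z**2 + np.log(2 * np.pi))
    log_p_x_given_z = -0.5 * ((x - z)**2 + np.log(2 * np.pi))
    log_q_z = -0.5 * (z**2 + np.log(2 * np.pi))
    return log_p_z + log_p_x_given_z - log_q_z

for K in [1, 5, 25, 125]:
    z = rng.standard_normal((100_000, K))     # 100k outer Monte Carlo samples
    bound = np.mean(logsumexp(log_w(z), axis=1) - np.log(K))
    print(f"K={K:4d}  IWAE bound ~ {bound:.4f}")
# The true log p(x) = log N(x; 0, 2) ~ -1.83; the bounds approach it from below.
```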
7 code implementations • ICLR 2019 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning.
13 code implementations • NeurIPS 2018 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.
1 code implementation • ICML 2018 • George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
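A quick numerical illustration of why a baseline helps (a generic score-function estimator, not this paper's experiments): subtracting a baseline $b$ leaves $\mathbb{E}[\nabla \log \pi(a)(R - b)]$ unbiased but can shrink its variance substantially.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.0
n = 200_000

# Gaussian policy a ~ N(theta, 1); the reward carries a large constant offset.
a = rng.normal(theta, 1.0, n)
reward = a + 5.0               # E[reward] = theta + 5, so the true gradient is 1
score = a - theta              # d/dtheta log N(a; theta, 1)

for b in [0.0, reward.mean()]:
    g = score * (reward - b)   # score-function gradient estimator
    print(f"baseline b={b:4.1f}: mean ~ {g.mean():.3f}, variance ~ {g.var():.2f}")
# Both estimates are ~1 (unbiased), but the baseline cuts the variance ~10x.
```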
no code implementations • ICLR 2018 • Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine
TDMs combine the benefits of model-free and model-based RL: they leverage the rich information in state transitions to learn very efficiently, while still attaining asymptotic performance that exceeds that of direct model-based RL methods.
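Concretely, a TDM is a goal- and horizon-conditioned Q-function; as best summarized from the paper (notation mine), it obeys the recursion

$$Q(s, a, s_g, \tau) = \mathbb{E}_{s'}\big[\, -D(s', s_g)\,\mathbb{1}[\tau = 0] \;+\; \max_{a'} Q(s', a', s_g, \tau - 1)\,\mathbb{1}[\tau > 0] \,\big],$$

where $D$ is a distance and $\tau$ counts remaining steps: at $\tau = 0$ the Q-value directly measures how close the dynamics can get to the goal, which is what makes it usable as an implicit model.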
no code implementations • ICLR 2018 • Chen Wang, Xiangyu Chen, Zelin Ye, Jialu Wang, Ziruo Cai, Shixiang Gu, Cewu Lu
However, tasks with sparse rewards remain challenging when the state space is large.
1 code implementation • ICLR 2018 • Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine
In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt.
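A schematic of the loop described above (a hedged sketch with placeholder policies and Q-functions, not the authors' implementation): before each forward action the agent checks whether the reset policy's value estimate still certifies that it can return to the start, and aborts early otherwise.

```python
def episode(env, forward_policy, reset_policy, q_reset, is_reset_done,
            q_min=0.3, max_steps=200):
    """One forward attempt followed by a learned reset (no hard env.reset())."""
    obs = env.reset()
    for _ in range(max_steps):
        action = forward_policy(obs)
        # Early abort: act only while the reset policy's Q-value says the
        # agent can still undo the consequences of this action.
        if q_reset(obs, action) < q_min:
            break
        obs, _, done, _ = env.step(action)
        if done:
            break
    for _ in range(max_steps):            # learned reset phase
        if is_reset_done(obs):
            break
        obs, _, _, _ = env.step(reset_policy(obs))
    return obs
```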
no code implementations • NeurIPS 2017 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques.
no code implementations • ICML 2017 • Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck
This paper proposes a general method for improving the structure and quality of sequences generated by a recurrent neural network (RNN), while maintaining information originally learned from data, as well as sample diversity.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
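For reference, the estimator combines a likelihood-ratio term with an analytic critic gradient (paraphrasing the paper's notation; $\bar{A}_w$ is the first-order Taylor expansion of the critic $Q_w$ about the deterministic action $\mu_\theta(s)$):

$$\nabla_\theta J(\theta) \approx \mathbb{E}\big[\nabla_\theta \log \pi_\theta(a|s)\,(\hat{A}(s,a) - \eta(s)\,\bar{A}_w(s,a))\big] + \mathbb{E}\big[\eta(s)\,\nabla_a Q_w(s,a)\big|_{a=\mu_\theta(s)}\,\nabla_\theta \mu_\theta(s)\big].$$

The conservative variant sets $\eta(s) = 1$ only when the estimated covariance between $\hat{A}$ and $\bar{A}_w$ is positive (and $0$ otherwise), while the aggressive variant uses the sign of that covariance.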
20 code implementations • 3 Nov 2016 • Eric Jang, Shixiang Gu, Ben Poole
Categorical variables are a natural choice for representing discrete structure in the world.
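A minimal sketch of the Gumbel-Softmax reparameterization the paper proposes (PyTorch; temperature and shapes are illustrative):

```python
import torch

def gumbel_softmax_sample(logits, tau=1.0):
    """Differentiable approximate sample from a categorical distribution."""
    # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
    u = torch.rand_like(logits).clamp(1e-20, 1.0 - 1e-7)
    g = -torch.log(-torch.log(u))
    # Softmax of perturbed logits; as tau -> 0 samples approach one-hot.
    return torch.softmax((logits + g) / tau, dim=-1)

logits = torch.tensor([1.0, 0.5, -1.0], requires_grad=True)
y = gumbel_softmax_sample(logits, tau=0.5)
loss = (y * torch.tensor([0.0, 1.0, 2.0])).sum()  # any downstream objective
loss.backward()                                    # gradients reach the logits
```

PyTorch now ships this as torch.nn.functional.gumbel_softmax, including a hard straight-through option.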
no code implementations • 3 Oct 2016 • Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine
In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
8 code implementations • 2 Mar 2016 • Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.
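One of the paper's ingredients is continuous-action Q-learning with normalized advantage functions (NAF); below is a hedged PyTorch sketch of the Q-function head (layer sizes illustrative). The quadratic form makes the greedy action available in closed form as $\mu(s)$.

```python
import torch
import torch.nn as nn

class NAFHead(nn.Module):
    """Q(s,a) = V(s) - 0.5 (a - mu(s))^T P(s) (a - mu(s)), with P = L L^T."""
    def __init__(self, obs_dim, act_dim, hidden=200):
        super().__init__()
        self.act_dim = act_dim
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
        self.v = nn.Linear(hidden, 1)
        self.mu = nn.Linear(hidden, act_dim)
        self.l = nn.Linear(hidden, act_dim * (act_dim + 1) // 2)
        self.tril_idx = torch.tril_indices(act_dim, act_dim)

    def forward(self, obs, act):
        h = self.trunk(obs)
        mu = torch.tanh(self.mu(h))
        L = torch.zeros(obs.shape[0], self.act_dim, self.act_dim)
        L[:, self.tril_idx[0], self.tril_idx[1]] = self.l(h)
        diag = torch.arange(self.act_dim)
        # Exponentiate the diagonal so P = L L^T is positive definite.
        L[:, diag, diag] = L[:, diag, diag].exp()
        P = L @ L.transpose(1, 2)
        d = (act - mu).unsqueeze(-1)
        adv = -0.5 * (d.transpose(1, 2) @ P @ d).squeeze(-1)
        return self.v(h) + adv
```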
2 code implementations • 16 Nov 2015 • Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih
Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.
no code implementations • NeurIPS 2015 • Shixiang Gu, Zoubin Ghahramani, Richard E. Turner
Experiments indicate that NASMC significantly improves inference in a non-linear state space model, outperforming adaptive proposal methods including the Extended Kalman and Unscented Particle Filters.
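To ground the comparison, here is a minimal sequential Monte Carlo filter on a 1-D linear-Gaussian model (NumPy). The learned neural proposal of NASMC is stood in for by the model's locally optimal Gaussian proposal, purely to illustrate what an adapted proposal buys:

```python
import numpy as np

rng = np.random.default_rng(0)
A, Q, R, T, N = 0.9, 0.5, 1.0, 50, 500   # dynamics, noises, steps, particles

# Simulate data from x_t = A x_{t-1} + N(0, Q), y_t = x_t + N(0, R).
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = A * x_true[t-1] + rng.normal(0.0, np.sqrt(Q))
y = x_true + rng.normal(0.0, np.sqrt(R), T)

# SMC with the locally optimal proposal q(x_t | x_{t-1}, y_t)
# (a stand-in for NASMC's learned neural proposal).
s2 = 1.0 / (1.0 / Q + 1.0 / R)            # proposal variance
particles = np.zeros(N)
means = np.zeros(T)
for t in range(1, T):
    m = s2 * (A * particles / Q + y[t] / R)
    new = rng.normal(m, np.sqrt(s2))
    # Incremental weight under the optimal proposal: p(y_t | x_{t-1}).
    logw = -0.5 * (y[t] - A * particles) ** 2 / (Q + R)
    w = np.exp(logw - logw.max()); w /= w.sum()
    means[t] = new @ w
    particles = new[rng.choice(N, N, p=w)]  # multinomial resampling
print("filter RMSE:", np.sqrt(np.mean((means - x_true) ** 2)))
```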
2 code implementations • 11 Dec 2014 • Shixiang Gu, Luca Rigazio
We perform various experiments to assess the removability of adversarial examples by corrupting with additional noise and pre-processing with denoising autoencoders (DAEs).
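A hedged PyTorch sketch of that evaluation pipeline (untrained toy networks for brevity; in practice the classifier and DAE would be trained first): craft an FGSM perturbation, then pass it through the denoiser before classification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
dae = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(),
                    nn.Linear(64, 28 * 28), nn.Sigmoid(),
                    nn.Unflatten(1, (1, 28, 28)))

x = torch.rand(8, 1, 28, 28)              # stand-in batch of images
labels = torch.randint(0, 10, (8,))

# FGSM adversarial example: x_adv = x + eps * sign(d loss / d x).
x.requires_grad_(True)
loss = F.cross_entropy(classifier(x), labels)
loss.backward()
x_adv = (x + 0.1 * x.grad.sign()).clamp(0, 1).detach()

# Pre-process with the denoising autoencoder before classifying.
with torch.no_grad():
    acc_adv = (classifier(x_adv).argmax(1) == labels).float().mean()
    acc_dae = (classifier(dae(x_adv)).argmax(1) == labels).float().mean()
print(acc_adv.item(), acc_dae.item())
```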