2 code implementations • NeurIPS 2021 • Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao
Recently, there has been significant progress in sample-efficient image-based RL algorithms; however, consistent human-level performance on the Atari game benchmark remains an elusive goal.
Ranked #1 on Atari Games 100k on the Atari 100k benchmark
no code implementations • 1 Jan 2021 • Thanard Kurutach, Julia Peng, Yang Gao, Stuart Russell, Pieter Abbeel
For decades, discrete representations have been key in enabling robots to plan at more abstract levels and solve temporally extended tasks more efficiently.
1 code implementation • NeurIPS 2020 • Younggyo Seo, Kimin Lee, Ignasi Clavera, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel
Model-based reinforcement learning (RL) has shown great potential in various control tasks in terms of both sample efficiency and final performance.
1 code implementation • NeurIPS 2020 • Scott Emmons, Ajay Jain, Michael Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak
To operate effectively in the real world, agents should be able to act from high-dimensional raw sensory input such as images and achieve diverse goals across long time-horizons.
1 code implementation • ICML 2020 • Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar
In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction.
no code implementations • 12 Nov 2019 • Tal Daniel, Thanard Kurutach, Aviv Tamar
In this work, we propose two variational methods for training VAEs for semi-supervised anomaly detection (SSAD).
2 code implementations • 29 Oct 2019 • Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, Pieter Abbeel
Second, instead of jointly learning both the pick and the place locations, we only explicitly learn the placing policy conditioned on random pick points.
no code implementations • 11 May 2019 • Angelina Wang, Thanard Kurutach, Kara Liu, Pieter Abbeel, Aviv Tamar
We further demonstrate our approach on learning to imagine and execute in three environments, the last of which is deformable rope manipulation on a PR2 robot.
1 code implementation • NeurIPS 2018 • Thanard Kurutach, Aviv Tamar, Ge Yang, Stuart Russell, Pieter Abbeel
Finally, to generate a visual plan, we project the current and goal observations onto their respective states in the planning model, plan a trajectory, and then use the generative model to transform the trajectory to a sequence of observations.
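The three steps in the sentence above (project, plan, decode) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the nearest-neighbour projection, the breadth-first search, and the names `project`, `plan`, `visual_plan`, `state_codes`, `edges`, and `decode` are all hypothetical stand-ins chosen for illustration, and the learned generative model is replaced by an arbitrary callable.

```python
from collections import deque


def project(obs, state_codes):
    """Return the index of the planning state closest to an observation.

    Assumption: a state is a vector of the same length as the observation and
    closeness is squared Euclidean distance.
    """
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(obs, state_codes[i]))
    return min(range(len(state_codes)), key=sq_dist)


def plan(start, goal, edges):
    """Breadth-first search over a hypothetical state-transition graph.

    Returns the list of state indices from start to goal, or None if the goal
    cannot be reached.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in edges.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def visual_plan(obs_now, obs_goal, state_codes, edges, decode):
    """Project both observations, plan between them, decode each state."""
    start = project(obs_now, state_codes)
    goal = project(obs_goal, state_codes)
    path = plan(start, goal, edges)
    if path is None:
        return None
    # `decode` stands in for the generative model mapping states to observations.
    return [decode(s) for s in path]
```

For example, with `state_codes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]`, `edges = {0: [1], 1: [2]}`, and a `decode` that simply returns the stored code, calling `visual_plan((0.1, -0.1), (0.9, 1.2), state_codes, edges, decode)` yields the three codes in order.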
2 code implementations • ICLR 2018 • Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
In this paper, we analyze the behavior of vanilla model-based reinforcement learning methods when deep neural networks are used to learn both the model and the policy, and show that the learned policy tends to exploit regions where insufficient data is available for the model to be learned, causing instability in training.
no code implementations • 2 Dec 2015 • Lawson L. S. Wong, Thanard Kurutach, Leslie Pack Kaelbling, Tomás Lozano-Pérez
We refer to this attribute-based representation as a world model, and consider how to acquire it via noisy perception and maintain it over time, as objects are added, changed, and removed in the world.