no code implementations • ICML 2020 • Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak
To solve complex tasks, intelligent agents first need to explore their environments.
no code implementations • 31 Jul 2023 • Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan
To interact with humans in the world, agents need to understand the diverse types of language that people use, relate them to the visual world, and act based on them.
no code implementations • 23 May 2023 • Alejandro Escontrela, Ademi Adeniji, Wilson Yan, Ajay Jain, Xue Bin Peng, Ken Goldberg, Youngwoon Lee, Danijar Hafner, Pieter Abbeel
A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet.
4 code implementations • 10 Jan 2023 • Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
General intelligence requires solving tasks across many domains.
1 code implementation • 24 Oct 2022 • Jurgis Pasukonis, Timothy Lillicrap, Danijar Hafner
However, most benchmark tasks in reinforcement learning do not test long-term memory in agents, slowing down progress in this important research direction.
1 code implementation • 21 Oct 2022 • Arnav Kumar Jain, Shivakanth Sujit, Shruti Joshi, Vincent Michalski, Danijar Hafner, Samira Ebrahimi-Kahou
Learning world models from their sensory inputs enables agents to plan for actions by imagining their future outcomes.
1 code implementation • 5 Oct 2022 • Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel
To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world.
no code implementations • 28 Jun 2022 • Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel
Yet current approaches typically train a single model end-to-end to learn both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects.
Model-based Reinforcement Learning • Reinforcement Learning (RL)
1 code implementation • 28 Jun 2022 • Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel
Learning a world model to predict the outcomes of potential actions enables planning in imagination, reducing the amount of trial and error needed in the real environment.
no code implementations • 8 Jun 2022 • Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel
Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.
no code implementations • NeurIPS 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
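As context for the entry above: once the agent's state visitation is modeled, the objective can be turned into an intrinsic reward that favors states the model already predicts well. A minimal sketch under simplifying assumptions, using a running Gaussian density as a stand-in for the learned latent state-space model (all names are illustrative, not the paper's code):

```python
import numpy as np

# Illustrative stand-in for a learned latent state-space model: a running Gaussian
# density over featurized states. The paper's model is far richer; this only shows
# how a state-density estimate turns into an entropy-minimizing intrinsic reward.
class RunningGaussian:
    def __init__(self, dim, eps=1e-6):
        self.count = 0
        self.total = np.zeros(dim)
        self.total_sq = np.zeros(dim)
        self.eps = eps

    def update(self, state):
        self.count += 1
        self.total += state
        self.total_sq += state ** 2

    def log_prob(self, state):
        mean = self.total / max(self.count, 1)
        var = self.total_sq / max(self.count, 1) - mean ** 2 + self.eps
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (state - mean) ** 2 / var)

def intrinsic_reward(model, state):
    # Reward states the model already finds likely; maximizing this drives the
    # policy toward predictable, low-entropy state visitation.
    model.update(state)
    return model.log_prob(state)
```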
2 code implementations • NeurIPS 2021 • Russell Mendonca, Oleh Rybkin, Kostas Daniilidis, Danijar Hafner, Deepak Pathak
How can artificial agents learn to solve many diverse tasks in complex visual environments in the absence of any supervision?
1 code implementation • ICLR 2022 • Danijar Hafner
We hope that Crafter will accelerate research progress by quickly evaluating a wide spectrum of abilities.
no code implementations • ICML Workshop URL 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
no code implementations • ICML Workshop URL 2021 • Russell Mendonca, Oleh Rybkin, Kostas Daniilidis, Danijar Hafner, Deepak Pathak
How can an artificial agent learn to solve a wide range of tasks in a complex visual environment in the absence of external supervision?
2 code implementations • NeurIPS 2021 • Vaibhav Saxena, Jimmy Ba, Danijar Hafner
We introduce the Clockwork VAE (CW-VAE), a video prediction model that leverages a hierarchy of latent sequences, where higher levels tick at slower intervals.
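The entry above describes a hierarchy of latent sequences in which higher levels tick at slower intervals. A schematic of that clockwork update schedule, with the per-level transition reduced to a placeholder (the actual model uses learned stochastic transitions and a decoder; names and factors here are illustrative):

```python
import numpy as np

def clockwork_rollout(num_levels=3, steps=8, stride=2, dim=4, seed=0):
    # Level l updates only every stride**l steps; between ticks it keeps its state,
    # so higher levels capture slower-changing structure.
    rng = np.random.default_rng(seed)
    states = [np.zeros(dim) for _ in range(num_levels)]
    history = []
    for t in range(steps):
        for level in reversed(range(num_levels)):       # top (slowest) to bottom (fastest)
            if t % (stride ** level) == 0:
                context = states[level + 1] if level + 1 < num_levels else np.zeros(dim)
                # Placeholder transition: the real model uses a learned stochastic
                # cell conditioned on the level above.
                states[level] = np.tanh(states[level] + context + 0.1 * rng.normal(size=dim))
        history.append(states[0].copy())                # the bottom level drives decoding
    return history
```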
no code implementations • 1 Jan 2021 • Vaibhav Saxena, Jimmy Ba, Danijar Hafner
Deep learning has shown promise for accurately predicting high-dimensional video sequences.
no code implementations • 1 Jan 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Dumitru Erhan, Harini Kannan, Chelsea Finn, Sergey Levine
In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.
Model-based Reinforcement Learning • Reinforcement Learning (RL)
1 code implementation • 21 Dec 2020 • Brendon Matusch, Jimmy Ba, Danijar Hafner
Moreover, input entropy and information gain correlate more strongly with human similarity than task reward does, suggesting the use of intrinsic objectives for designing agents that behave similarly to human players.
1 code implementation • 8 Dec 2020 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan
In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.
Model-based Reinforcement Learning • Reinforcement Learning (RL)
no code implementations • ICLR 2021 • Kevin Xie, Homanga Bharadhwaj, Danijar Hafner, Animesh Garg, Florian Shkurti
To quickly solve new tasks in complex environments, intelligent agents need to build up reusable knowledge.
8 code implementations • ICLR 2021 • Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
The world model uses discrete representations and is trained separately from the policy.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
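The entry above notes that the world model uses discrete representations. Categorical latent variables of this kind are commonly trained with a straight-through gradient estimator; a minimal sketch in JAX, with shapes and the 32-variable-by-32-class layout chosen for illustration rather than taken from the entry:

```python
import jax
import jax.numpy as jnp

def sample_categorical_latent(key, logits):
    # logits: [batch, num_vars, num_classes] for a vector of categorical latents.
    probs = jax.nn.softmax(logits, axis=-1)
    classes = jax.random.categorical(key, logits, axis=-1)   # one class per variable
    onehot = jax.nn.one_hot(classes, logits.shape[-1])
    # Straight-through estimator: the forward pass uses the discrete sample,
    # while gradients flow through the probabilities.
    return onehot + probs - jax.lax.stop_gradient(probs)

# Example: 32 latent variables with 32 classes each.
key = jax.random.PRNGKey(0)
logits = jax.random.normal(key, (1, 32, 32))
latent = sample_categorical_latent(key, logits)
```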
1 code implementation • 3 Sep 2020 • Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess
While the narrow objectives correspond to domain-specific rewards as typical in reinforcement learning, the general objectives maximize information with the environment through latent variable models of input sequences.
no code implementations • 7 Jun 2020 • Karl Friston, Lancelot Da Costa, Danijar Hafner, Casper Hesp, Thomas Parr
In this paper, we consider a sophisticated kind of active inference, using a recursive form of expected free energy.
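Schematically (a sketch of the structure only; the precise risk and ambiguity terms follow the paper's definitions), the recursive form means that the expected free energy of an action includes the expected free energy of the actions that would follow it:

G(a_t | s_t) = risk(s_t, a_t) + ambiguity(s_t, a_t) + E_{Q(s_{t+1}, a_{t+1} | s_t, a_t)} [ G(a_{t+1} | s_{t+1}) ]

so planning bootstraps on deeper time steps rather than scoring fixed action sequences.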
4 code implementations • 12 May 2020 • Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak
Reinforcement learning allows solving complex tasks; however, learning tends to be task-specific and sample efficiency remains a challenge.
18 code implementations • ICLR 2020 • Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Learned world models summarize an agent's experience to facilitate learning complex behaviors.
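For the entry above, behaviors are derived inside the learned model: starting from model states, the policy is unrolled purely in the latent dynamics, so behavior learning needs no additional environment interaction. A minimal sketch of that imagination loop (function and argument names are illustrative placeholders, not the paper's code):

```python
def imagine_trajectory(start_state, horizon, dynamics_step, actor):
    """Unroll the policy inside the learned latent dynamics.

    dynamics_step(state, action) -> next_state and actor(state) -> action are
    placeholders for the learned transition model and the policy network.
    """
    states, actions = [start_state], []
    for _ in range(horizon):
        action = actor(states[-1])                         # act from the imagined state
        states.append(dynamics_step(states[-1], action))   # predict, never touching the env
        actions.append(action)
    return states, actions

# Toy usage with stand-in functions; the real components are neural networks.
trajectory = imagine_trajectory(
    start_state=0.0, horizon=5,
    dynamics_step=lambda s, a: 0.9 * s + a,
    actor=lambda s: 0.1 - 0.05 * s,
)
```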
no code implementations • ICLR 2019 • Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson
NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training.
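As context for the entry above, noise contrastive priors encourage a model to be uncertain away from the training data by perturbing inputs and pulling the predictions at those perturbed points toward a wide prior. A sketch of one way to instantiate such a regularizer, assuming a model that outputs a Gaussian predictive mean and standard deviation (the names and the specific KL form are illustrative, not the paper's exact formulation):

```python
import numpy as np

def ncp_style_regularizer(predict, inputs, rng, noise_scale=0.5, prior_std=1.0):
    # Perturb training inputs to obtain pseudo out-of-distribution points.
    ood_inputs = inputs + noise_scale * rng.normal(size=inputs.shape)
    mean, std = predict(ood_inputs)        # predictive Gaussian at the perturbed points
    # KL(N(mean, std) || N(0, prior_std)): penalizes confident (low-std) predictions
    # away from the data, which is the effect a noise contrastive prior aims for.
    kl = np.log(prior_std / std) + (std ** 2 + mean ** 2) / (2 * prior_std ** 2) - 0.5
    return kl.mean()

# Toy usage with a stand-in predictor returning (mean, std) per example.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))
loss = ncp_style_regularizer(lambda z: (z.sum(axis=1), np.full(len(z), 0.2)), x, rng)
```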
1 code implementation • NeurIPS 2019 • Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner
We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty.
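To illustrate the kind of uncertainty such layers expose, here is a generic mean-field variational dense layer; this is a sketch for illustration only, not the Bayesian Layers API:

```python
import numpy as np

class VariationalDense:
    # Mean-field Gaussian weight posterior; illustrative only, not the library's API.
    def __init__(self, in_dim, out_dim, rng):
        self.rng = rng
        self.w_mean = rng.normal(scale=0.1, size=(in_dim, out_dim))
        self.w_logstd = np.full((in_dim, out_dim), -2.0)

    def __call__(self, x):
        # Each call samples fresh weights, so repeated forward passes give a
        # distribution over outputs that reflects weight uncertainty.
        w = self.w_mean + np.exp(self.w_logstd) * self.rng.normal(size=self.w_mean.shape)
        return x @ w

rng = np.random.default_rng(0)
layer = VariationalDense(4, 2, rng)
x = rng.normal(size=(8, 4))
samples = np.stack([layer(x) for _ in range(10)])   # predictive samples from weight noise
uncertainty = samples.std(axis=0)                    # per-output spread across samples
```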
no code implementations • 30 Nov 2018 • Alexander Pashevich, Danijar Hafner, James Davidson, Rahul Sukthankar, Cordelia Schmid
To achieve this, we study different modulation signals and exploration for hierarchical controllers.
8 code implementations • 12 Nov 2018 • Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson
Planning has been very successful for control tasks with known environment dynamics.
Ranked #2 on Continuous Control on DeepMind Cheetah Run (Images)
2 code implementations • ICLR 2019 • Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson
NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training.
2 code implementations • NeurIPS 2018 • Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
Integrating model-free and model-based approaches in reinforcement learning has the potential to achieve the high performance of model-free algorithms with low sample complexity.
no code implementations • 27 Apr 2018 • Jie Tan, Tingnan Zhang, Erwin Coumans, Atil Iscen, Yunfei Bai, Danijar Hafner, Steven Bohez, Vincent Vanhoucke
The control policies are learned in a physics simulator and then deployed on real robots.
no code implementations • 28 Nov 2017 • Danijar Hafner, Alexander Immer, Willi Raschkowski, Fabian Windheuser
Then, we capture a user's interests as a generative model in the space of document representations.
2 code implementations • 8 Sep 2017 • Danijar Hafner, James Davidson, Vincent Vanhoucke
We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow.
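The central idea of the infrastructure above is to step many environment copies together so that the learner sees batched transitions. A minimal sketch of a batched-environment interface (the paper's infrastructure additionally runs environments in separate processes and integrates the batch into the TensorFlow graph; the class and method names here are illustrative, not the library's API):

```python
class BatchEnv:
    # Steps several environment copies in lockstep; illustrative sketch only.
    def __init__(self, make_env, num_envs):
        self.envs = [make_env() for _ in range(num_envs)]

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        observations, rewards, dones, infos = zip(*results)
        return list(observations), list(rewards), list(dones), list(infos)
```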
no code implementations • NeurIPS 2017 • Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess
We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus.
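The entry above describes modules communicating through a shared routing center analogous to the thalamus. A schematic of one step of such routing, with the recurrent update left as a placeholder (names and shapes are illustrative, not the paper's code):

```python
import numpy as np

def thalnet_step(center, module_states, read_weights, cells):
    # Each module reads from the shared center with its own learned weights,
    # updates its recurrent state, and writes features back; the concatenated
    # writes form the next center vector.
    outputs, new_states = [], []
    for weights, cell, state in zip(read_weights, cells, module_states):
        reading = center @ weights                   # learned read from the center
        output, new_state = cell(reading, state)     # placeholder recurrent update
        outputs.append(output)
        new_states.append(new_state)
    return np.concatenate(outputs), new_states

# Toy usage with stand-in cells that mix the reading into the state.
cells = [lambda r, s: (np.tanh(r + s), 0.5 * s + 0.5 * r) for _ in range(3)]
center = np.zeros(6)
states = [np.zeros(2) for _ in range(3)]
reads = [np.random.default_rng(i).normal(size=(6, 2)) for i in range(3)]
center, states = thalnet_step(center, states, reads, cells)
```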
no code implementations • 7 Oct 2016 • Danijar Hafner
Using current reinforcement learning methods, it has recently become possible to learn to play unknown 3D games from raw pixels.