1 code implementation • 19 Jul 2023 • Christopher Rawles, Alice Li, Daniel Rodriguez, Oriana Riva, Timothy Lillicrap
The dataset contains human demonstrations of device interactions, including the screens and actions, and corresponding natural language instructions.
4 code implementations • 10 Jan 2023 • Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
General intelligence requires solving tasks across many domains.
no code implementations • 21 Nov 2022 • Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Jirka Lhotka, Timothy Lillicrap, Alistair Muldal, George Powell, Adam Santoro, Guy Scully, Sanjana Srivastava, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu
Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning.
1 code implementation • 24 Oct 2022 • Jurgis Pasukonis, Timothy Lillicrap, Danijar Hafner
However, most benchmark tasks in reinforcement learning do not test long-term memory in agents, slowing down progress in this important research direction.
no code implementations • 15 Oct 2022 • Anthony Zador, Sean Escola, Blake Richards, Bence Ölveczky, Yoshua Bengio, Kwabena Boahen, Matthew Botvinick, Dmitri Chklovskii, Anne Churchland, Claudia Clopath, James DiCarlo, Surya Ganguli, Jeff Hawkins, Konrad Koerding, Alexei Koulakov, Yann LeCun, Timothy Lillicrap, Adam Marblestone, Bruno Olshausen, Alexandre Pouget, Cristina Savin, Terrence Sejnowski, Eero Simoncelli, Sara Solla, David Sussillo, Andreas S. Tolias, Doris Tsao
Neuroscience has long been an essential driver of progress in artificial intelligence (AI).
no code implementations • 10 Jun 2022 • Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre, Théophane Weber, Timothy Lillicrap
Effective decision making involves flexibly relating past experiences and relevant contextual information to a novel situation.
no code implementations • 7 Jun 2022 • Chen Yan, Federico Carnevale, Petko Georgiev, Adam Santoro, Aurelia Guy, Alistair Muldal, Chia-Chun Hung, Josh Abramson, Timothy Lillicrap, Gregory Wayne
Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data.
no code implementations • 26 May 2022 • Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Timothy Lillicrap, Alistair Muldal, Blake Richards, Adam Santoro, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan
Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research.
no code implementations • 25 Feb 2022 • Sergey Bartunov, Fabian B. Fuchs, Timothy Lillicrap
Processing sets or other unordered, potentially variable-sized inputs in neural networks is usually handled by aggregating a number of input tensors into a single representation.
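The standard aggregation this abstract refers to can be illustrated with a minimal NumPy sketch (not the paper's proposed method): each set element is embedded independently, the embeddings are pooled with a sum, and the pooled vector is read out, making the result invariant to element order. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def sum_pool_encoder(inputs, w_enc, w_out):
    """Permutation-invariant set encoder: embed each element,
    sum the embeddings, then map the pooled vector to an output.

    inputs: (n, d_in) array -- a set of n element vectors.
    w_enc:  (d_in, d_hid) shared per-element embedding weights.
    w_out:  (d_hid, d_out) readout weights.
    """
    embedded = np.tanh(inputs @ w_enc)      # (n, d_hid), applied per element
    pooled = embedded.sum(axis=0)           # (d_hid,), order-independent
    return pooled @ w_out                   # (d_out,)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
w1, w2 = rng.normal(size=(3, 8)), rng.normal(size=(8, 2))
y = sum_pool_encoder(x, w1, w2)
y_perm = sum_pool_encoder(x[::-1], w1, w2)  # same set, different order
assert np.allclose(y, y_perm)               # invariant to element order
```

Swapping the sum for a mean or max changes the pooling statistics but preserves the permutation invariance that makes this the usual baseline for set-valued inputs.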
no code implementations • 17 Feb 2022 • Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent Sifre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell
In this paper we explore an alternative paradigm in which we train a network to map a dataset of past experiences to optimal behavior.
no code implementations • 16 Feb 2022 • Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap
It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks.
no code implementations • 7 Dec 2021 • DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Mansi Gupta, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language.
no code implementations • NeurIPS 2021 • Shahab Bakhtiari, Patrick Mineault, Timothy Lillicrap, Christopher Pack, Blake Richards
We show that when we train a deep neural network architecture with two parallel pathways using a self-supervised predictive loss function, we can outperform other models in fitting mouse visual cortex.
1 code implementation • 5 Feb 2021 • Adam Santoro, Andrew Lampinen, Kory Mathewson, Timothy Lillicrap, David Raposo
This approach would allow AI to interpret something as symbolic on its own, rather than merely manipulating things that are symbols only to human onlookers, and could ultimately lead to AI with more human-like symbolic fluency.
no code implementations • 10 Dec 2020 • Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne, Duncan Williams, Nathaniel Wong, Chen Yan, Rui Zhu
These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone.
no code implementations • NeurIPS 2020 • Basile Confavreux, Friedemann Zenke, Everton Agnes, Timothy Lillicrap, Tim Vogels
Instead of experimental data, the rules are constrained by the functions they implement and the structure they are meant to produce.
8 code implementations • ICLR 2021 • Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
The world model uses discrete representations and is trained separately from the policy.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
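The discrete representations mentioned above can be sketched as one-hot categorical latents sampled from predicted logits. This minimal NumPy version is an illustrative assumption, not the paper's full latent design; in an autograd framework the hard sample would be combined with a straight-through estimator so gradients still reach the logits.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def sample_discrete_latent(logits, rng):
    """Sample a batch of one-hot categorical latents.

    In an autograd framework a straight-through estimator would return
    `one_hot + probs - stop_gradient(probs)`: the value equals the hard
    one-hot sample, while gradients flow through the soft `probs`.
    """
    probs = softmax(logits)                       # (batch, k)
    idx = [rng.choice(len(p), p=p) for p in probs]
    one_hot = np.eye(probs.shape[-1])[idx]        # hard samples
    return one_hot

rng = np.random.default_rng(0)
z = sample_discrete_latent(rng.normal(size=(4, 8)), rng)
assert z.shape == (4, 8)
assert np.allclose(z.sum(axis=-1), 1.0)   # each latent is one-hot
```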
no code implementations • 3 Oct 2020 • Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber
We explore whether integrated tasks like Mujoban can be solved by composing RL modules together in a sense-plan-act hierarchy, where modules have well-defined roles similarly to classic robot architectures.
1 code implementation • 11 Sep 2020 • Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Peter Karkus, Sébastien Racanière, Lars Buesing, Timothy Lillicrap, Nicolas Heess
To encourage progress towards this goal we introduce a set of physically embedded planning problems and make them publicly available.
2 code implementations • 22 Jun 2020 • Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Piotr Trochim, Si-Qi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, Nicolas Heess
The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation.
no code implementations • ICLR 2020 • Sebastien Racaniere, Andrew Lampinen, Adam Santoro, David Reichert, Vlad Firoiu, Timothy Lillicrap
We demonstrate the success of our approach in rich but sparsely rewarding 2D and 3D environments, where an agent is tasked to achieve a single goal selected from a set of possible goals that varies between episodes, and identify challenges for future work.
18 code implementations • ICLR 2020 • Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Learned world models summarize an agent's experience to facilitate learning complex behaviors.
1 code implementation • 2 Dec 2019 • Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
Training generative adversarial networks requires balancing of delicate adversarial dynamics.
18 code implementations • 19 Nov 2019 • Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver
When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
Ranked #1 on Atari Games on Atari 2600 Alien
1 code implementation • 16 May 2019 • Yan Wu, Mihaela Rosca, Timothy Lillicrap
Compressed sensing (CS) is flexible and data-efficient, but its application has been restricted by the strong assumption of sparsity and a costly reconstruction process.
no code implementations • ICLR 2019 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia
We introduce an approach for augmenting model-free deep reinforcement learning agents with a mechanism for relational reasoning over structured representations, which improves performance, learning efficiency, generalization, and interpretability.
no code implementations • ICLR 2019 • Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson
NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training.
no code implementations • 18 Apr 2019 • Adam Santoro, Felix Hill, David Barrett, David Raposo, Matthew Botvinick, Timothy Lillicrap
Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does.
3 code implementations • NeurIPS 2019 • Mohamed Akrout, Collin Wilson, Peter C. Humphreys, Timothy Lillicrap, Douglas Tweed
Current algorithms for deep learning probably cannot run in the brain because they rely on weight transport, where forward-path neurons transmit their synaptic weights to a feedback path, in a way that is likely impossible biologically.
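One earlier workaround for the weight-transport problem is feedback alignment, which carries the error signal back through a fixed random matrix instead of the transpose of the forward weights; the paper above proposes further improvements, but a minimal feedback-alignment sketch (all sizes and rates are illustrative assumptions) shows that learning without weight transport is possible on a toy regression task.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 16, 2
W1 = rng.normal(scale=0.5, size=(n_hid, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hid))
B = rng.normal(scale=0.5, size=(n_hid, n_out))  # fixed random feedback, not W2.T

X = rng.normal(size=(64, n_in))
T = X @ rng.normal(size=(n_in, n_out))          # toy linear regression targets

def forward(X):
    H = np.tanh(X @ W1.T)
    return H, H @ W2.T

lr = 0.05
_, Y = forward(X)
loss_start = np.mean((Y - T) ** 2)
for _ in range(200):
    H, Y = forward(X)
    E = Y - T                                    # output error
    dW2 = E.T @ H / len(X)
    # feedback alignment: error is sent back through fixed B, not W2.T
    dH = (E @ B.T) * (1 - H ** 2)
    dW1 = dH.T @ X / len(X)
    W1 -= lr * dW1
    W2 -= lr * dW2
_, Y = forward(X)
loss_end = np.mean((Y - T) ** 2)
assert loss_end < loss_start                     # learning without weight transport
```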
2 code implementations • ICLR 2019 • Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap
Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data.
1 code implementation • ICLR 2019 • Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy Lillicrap
The field of reinforcement learning (RL) is facing increasingly challenging domains with combinatorial complexity.
1 code implementation • NeurIPS 2018 • Yan Wu, Greg Wayne, Karol Gregor, Timothy Lillicrap
Based on the idea of memory writing as inference, as proposed in the Kanerva Machine, we show that a likelihood-based Lyapunov function emerges from maximising the variational lower-bound of a generative memory.
8 code implementations • 12 Nov 2018 • Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson
Planning has been very successful for control tasks with known environment dynamics.
Ranked #2 on Continuous Control on DeepMind Cheetah Run (Images)
no code implementations • 15 Oct 2018 • Chia-Chun Hung, Timothy Lillicrap, Josh Abramson, Yan Wu, Mehdi Mirza, Federico Carnevale, Arun Ahuja, Greg Wayne
Humans spend a remarkable fraction of waking life engaged in acts of "mental time travel".
1 code implementation • ICLR 2019 • Nikolay Savinov, Anton Raichuk, Raphaël Marinier, Damien Vincent, Marc Pollefeys, Timothy Lillicrap, Sylvain Gelly
One solution to this problem is to allow the agent to create rewards for itself - thus making rewards dense and more suitable for learning.
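The idea of an agent creating rewards for itself can be illustrated with a count-based novelty bonus. This is a simpler stand-in for the paper's reachability-based episodic curiosity, not its actual method; both make sparse rewards denser by rewarding novel states.

```python
from collections import Counter
import math

class CountBonus:
    """Count-based exploration bonus: r_int(s) = beta / sqrt(N(s)).

    Novel states earn a large intrinsic reward; frequently revisited
    states earn progressively less.
    """
    def __init__(self, beta=1.0):
        self.counts = Counter()
        self.beta = beta

    def reward(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

bonus = CountBonus()
first = bonus.reward("s0")    # novel state -> full bonus
second = bonus.reward("s0")   # revisited -> smaller bonus
assert first == 1.0
assert second < first
```

In practice the intrinsic reward is added to the environment reward, so exploration is shaped without changing the underlying task.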
2 code implementations • ICLR 2019 • Danijar Hafner, Dustin Tran, Timothy Lillicrap, Alex Irpan, James Davidson
NCPs are compatible with any model that can output uncertainty estimates, are easy to scale, and yield reliable uncertainty estimates throughout training.
1 code implementation • NeurIPS 2018 • Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap
Here we present results on scaling up biologically motivated models of deep learning to datasets that require deep networks with appropriate architectures to achieve good performance.
2 code implementations • ICML 2018 • David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap
To succeed at this challenge, models must cope with various generalisation 'regimes' in which the training and test data differ in clearly-defined ways.
7 code implementations • 5 Jun 2018 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.
2 code implementations • NeurIPS 2018 • Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap
Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods.
Ranked #59 on Language Modelling on WikiText-103
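The core operation in such memory-based architectures is attention across a set of memory slots, letting every slot read from every other slot. The sketch below is a single-head, NumPy-only illustration with assumed shapes, not the paper's full relational memory core.

```python
import numpy as np

def softmax(x, axis=-1):
    z = np.exp(x - x.max(axis=axis, keepdims=True))
    return z / z.sum(axis=axis, keepdims=True)

def self_attend(memory, Wq, Wk, Wv):
    """One step of scaled dot-product self-attention over memory slots."""
    Q, K, V = memory @ Wq, memory @ Wk, memory @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # (slots, slots)
    weights = softmax(scores, axis=-1)            # each row sums to 1
    return weights @ V                            # updated slot contents

rng = np.random.default_rng(0)
slots, d = 6, 8
M = rng.normal(size=(slots, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
M_new = self_attend(M, Wq, Wk, Wv)
assert M_new.shape == (slots, d)
```

Stacking several such heads and interleaving the result with the recurrent state is what allows memories to interact, rather than merely being stored and recalled in isolation.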
4 code implementations • ICLR 2018 • Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap
This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting.
no code implementations • ICLR 2018 • Yan Wu, Greg Wayne, Alex Graves, Timothy Lillicrap
We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them.
no code implementations • ICLR 2019 • Anirudh Goyal, Philemon Brakel, William Fedus, Soumye Singhal, Timothy Lillicrap, Sergey Levine, Hugo Larochelle, Yoshua Bengio
In many environments only a tiny subset of all states yield high reward.
1 code implementation • 28 Mar 2018 • Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap
Animals execute goal-directed behaviours despite the limited range and scope of their sensors.
7 code implementations • 2 Jan 2018 • Yuval Tassa, Yotam Doron, Alistair Muldal, Tom Erez, Yazhe Li, Diego de Las Casas, David Budden, Abbas Abdolmaleki, Josh Merel, Andrew Lefrancq, Timothy Lillicrap, Martin Riedmiller
The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents.
58 code implementations • 5 Dec 2017 • David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis
The game of chess is the most widely-studied domain in the history of artificial intelligence.
Ranked #1 on Game of Go on ELO Ratings
10 code implementations • 16 Aug 2017 • Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain.
Ranked #1 on StarCraft II on MoveToBeacon
20 code implementations • NeurIPS 2017 • Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap
Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn.
Tasks: Image Retrieval with Multi-Modal Query, Question Answering, +2
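The Relation Network idea can be sketched compactly: a shared function g is applied to every ordered pair of objects, the results are summed, and a second function f maps the aggregate to the output, so the result does not depend on object order. The MLPs are collapsed to single tanh layers here purely for illustration.

```python
import numpy as np

def relation_network(objects, Wg, Wf):
    """RN sketch: f( sum over pairs of g(o_i, o_j) )."""
    n, d = objects.shape
    pair_sum = np.zeros(Wg.shape[1])
    for i in range(n):
        for j in range(n):
            pair = np.concatenate([objects[i], objects[j]])  # (2d,)
            pair_sum += np.tanh(pair @ Wg)                   # g(o_i, o_j)
    return pair_sum @ Wf                                     # f(sum)

rng = np.random.default_rng(0)
objs = rng.normal(size=(4, 3))
Wg, Wf = rng.normal(size=(6, 8)), rng.normal(size=(8, 2))
out = relation_network(objs, Wg, Wf)
out_perm = relation_network(objs[::-1], Wg, Wf)
assert out.shape == (2,)
assert np.allclose(out, out_perm)   # invariant to object order
```

Because g is shared across all pairs, the number of parameters is independent of the number of objects, which is what makes the module plug into perception pipelines with variable numbers of entities.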
no code implementations • NeurIPS 2017 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques.
no code implementations • ICLR 2018 • Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller
Solving this difficult and practically relevant problem in the real world is an important long-term goal for the field of robotics.
no code implementations • 16 Feb 2017 • David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, Peter Battaglia
We show that RNs are capable of learning object relations from scene description data.
no code implementations • 15 Feb 2017 • Mevlana Gemici, Chia-Chun Hung, Adam Santoro, Greg Wayne, Shakir Mohamed, Danilo J. Rezende, David Amos, Timothy Lillicrap
We consider the general problem of modeling temporal data with long-range dependencies, wherein new observations are fully or partially predictable based on temporally-distant, past observations.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
no code implementations • 17 Oct 2016 • Nicolas Heess, Greg Wayne, Yuval Tassa, Timothy Lillicrap, Martin Riedmiller, David Silver
We study a novel architecture and training procedure for locomotion tasks.
no code implementations • 3 Oct 2016 • Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine
In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
26 code implementations • NeurIPS 2016 • Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra
Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches.
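The matching-network-style classification rule can be sketched in a few lines: attend over the support set with cosine similarity, then take a similarity-weighted vote over support labels. This omits the paper's learned embedding networks and full-context encoding; the raw feature vectors below are an illustrative assumption.

```python
import numpy as np

def matching_classify(query, support_x, support_y, n_classes):
    """Attend over the support set with cosine similarity and return
    a softmax-weighted vote over support labels."""
    def norm(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    sims = norm(support_x) @ norm(query)          # (n_support,)
    attn = np.exp(sims) / np.exp(sims).sum()      # softmax attention
    label_onehot = np.eye(n_classes)[support_y]   # (n_support, n_classes)
    return attn @ label_onehot                    # class probabilities

support_x = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
support_y = np.array([0, 0, 1])
probs = matching_classify(np.array([0.95, 0.05]), support_x, support_y, 2)
assert probs.argmax() == 0       # query resembles the class-0 examples
assert np.isclose(probs.sum(), 1.0)
```

Because classification is a weighted nearest-neighbour lookup rather than a learned output layer, new classes can be handled at test time simply by changing the support set.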
11 code implementations • 19 May 2016 • Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap
Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning."
8 code implementations • 2 Mar 2016 • Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.
2 code implementations • 24 Dec 2015 • Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin
Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.
3 code implementations • NeurIPS 2015 • Nicolas Heess, Greg Wayne, David Silver, Timothy Lillicrap, Yuval Tassa, Tom Erez
One of these variants, SVG(1), shows the effectiveness of learning models, value functions, and policies simultaneously in continuous domains.