Search Results for author: Hado van Hasselt

Found 52 papers, 19 papers with code

Disentangling the Causes of Plasticity Loss in Neural Networks

no code implementations • 29 Feb 2024 • Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney

Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a \textit{stationary} data distribution.

Atari Games reinforcement-learning

Paper
Add Code

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

no code implementations • 2 Dec 2023 • Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Thomas Mesnard, Hado van Hasselt, Laura Toni

In this survey, we review the state of the art of Temporal Credit Assignment (CA) in deep RL.

Decision Making reinforcement-learning +1

Paper
Add Code

A Definition of Continual Reinforcement Learning

no code implementations • NeurIPS 2023 • David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh

Using this new language, we define a continual learning agent as one that can be understood as carrying out an implicit search process indefinitely, and continual reinforcement learning as the setting in which the best agents are all continual learning agents.

Continual Learning reinforcement-learning

Paper
Add Code

On the Convergence of Bounded Agents

no code implementations • 20 Jul 2023 • David Abel, André Barreto, Hado van Hasselt, Benjamin Van Roy, Doina Precup, Satinder Singh

Standard models of the reinforcement learning problem give rise to a straightforward definition of convergence: An agent converges when its behavior or performance in each environment state stops changing.

reinforcement-learning

Paper
Add Code

Exploration via Epistemic Value Estimation

no code implementations • 7 Mar 2023 • Simon Schmitt, John Shawe-Taylor, Hado van Hasselt

We propose epistemic value estimation (EVE): a recipe that is compatible with sequential decision making and with neural network function approximators.

Decision Making Efficient Exploration +1

Paper
Add Code

Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

no code implementations • 8 Feb 2023 • Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt

To generalize across tasks, an agent should acquire knowledge from past tasks that facilitate adaptation and exploration in future tasks.

Bayesian Inference Thompson Sampling

Paper
Add Code

Human-level Atari 200x faster

1 code implementation • 15 Sep 2022 • Steven Kapturowski, Víctor Campos, Ray Jiang, Nemanja Rakićević, Hado van Hasselt, Charles Blundell, Adrià Puigdomènech Badia

The task of building general agents that perform well over a wide range of tasks has been an importantgoal in reinforcement learning since its inception.

Paper
Code

Selective Credit Assignment

no code implementations • 20 Feb 2022 • Veronica Chelu, Diana Borsa, Doina Precup, Hado van Hasselt

Efficient credit assignment is essential for reinforcement learning algorithms in both prediction and control settings.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Chaining Value Functions for Off-Policy Learning

no code implementations • 17 Jan 2022 • Simon Schmitt, John Shawe-Taylor, Hado van Hasselt

To accumulate knowledge and improve its policy of behaviour, a reinforcement learning agent can learn `off-policy' about policies that differ from the policy used to generate its experience.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Self-Consistent Models and Values

no code implementations • NeurIPS 2021 • Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity

no code implementations • 8 Oct 2021 • Marta Garnelo, Wojciech Marian Czarnecki, SiQi Liu, Dhruva Tirumala, Junhyuk Oh, Gauthier Gidel, Hado van Hasselt, David Balduzzi

Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance.

Paper
Add Code

Learning by Directional Gradient Descent

no code implementations • ICLR 2022 • David Silver, Anirudh Goyal, Ivo Danihelka, Matteo Hessel, Hado van Hasselt

How should state be constructed from a sequence of observations, so as to best achieve some objective?

Paper
Add Code

Introducing Symmetries to Black Box Meta Reinforcement Learning

no code implementations • 22 Sep 2021 • Louis Kirsch, Sebastian Flennerhag, Hado van Hasselt, Abram Friesen, Junhyuk Oh, Yutian Chen

We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems.

Meta-Learning Meta Reinforcement Learning +2

Paper
Add Code

Bootstrapped Meta-Learning

1 code implementation • ICLR 2022 • Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh

We achieve a new state-of-the art for model-free agents on the Atari ALE benchmark and demonstrate that it yields both performance and efficiency gains in multi-task meta-learning.

Efficient Exploration Few-Shot Learning +1

Paper
Code

Learning Expected Emphatic Traces for Deep RL

no code implementations • 12 Jul 2021 • Ray Jiang, Shangtong Zhang, Veronica Chelu, Adam White, Hado van Hasselt

We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed $n$-step TD learning algorithm to learn the required emphatic weighting.

Paper
Add Code

Emphatic Algorithms for Deep Reinforcement Learning

no code implementations • 21 Jun 2021 • Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

In this paper, we extend the use of emphatic methods to deep reinforcement learning agents.

Atari Games reinforcement-learning +1

Paper
Add Code

Muesli: Combining Improvements in Policy Optimization

2 code implementations • 13 Apr 2021 • Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent SIfre, Theophane Weber, David Silver, Hado van Hasselt

We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss.

Ranked #8 on Atari Games on atari game

Atari Games Continuous Control

Paper
Code

Podracer architectures for scalable Reinforcement Learning

3 code implementations • 13 Apr 2021 • Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt

Supporting state-of-the-art AI research requires balancing rapid prototyping, ease of use, and quick iteration, with the ability to deploy experiments at a scale traditionally associated with production systems. Deep learning frameworks such as TensorFlow, PyTorch and JAX allow users to transparently make use of accelerators, such as TPUs and GPUs, to offload the more computationally intensive parts of training and inference in modern deep learning systems.

reinforcement-learning Reinforcement Learning (RL)

519

Paper
Code

Synthetic Returns for Long-Term Credit Assignment

2 code implementations • 24 Feb 2021 • David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song

We propose state-associative (SA) learning, where the agent learns associations between states and arbitrarily distant future rewards, then propagates credit directly between the two.

3,521

Paper
Code

Discovery of Options via Meta-Learned Subgoals

no code implementations • NeurIPS 2021 • Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.

Reinforcement Learning (RL)

Paper
Add Code

Forethought and Hindsight in Credit Assignment

no code implementations • NeurIPS 2020 • Veronica Chelu, Doina Precup, Hado van Hasselt

We address the problem of credit assignment in reinforcement learning and explore fundamental questions regarding the way in which an agent can best use additional computation to propagate new information, by planning with internal models of the world to improve its predictions.

Reinforcement Learning (RL)

Paper
Add Code

Discovering Reinforcement Learning Algorithms

1 code implementation • NeurIPS 2020 • Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments.

Atari Games Meta-Learning +3

Paper
Code

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

no code implementations • NeurIPS 2020 • Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment.

Q-Learning reinforcement-learning +1

Paper
Add Code

Expected Eligibility Traces

no code implementations • 3 Jul 2020 • Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence.

counterfactual

Paper
Add Code

A Self-Tuning Actor-Critic Algorithm

no code implementations • NeurIPS 2020 • Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.

Atari Games reinforcement-learning +1

Paper
Add Code

What Can Learned Intrinsic Rewards Capture?

no code implementations • ICML 2020 • Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.

Paper
Add Code

Hindsight Credit Assignment

1 code implementation • NeurIPS 2019 • Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Remi Munos

We consider the problem of efficient credit assignment in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Conditional Importance Sampling for Off-Policy Learning

no code implementations • 16 Oct 2019 • Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos, Will Dabney

We theoretically analyse this space, and concretely investigate several algorithms that arise from this framework.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Reinforcement Learning (RL)

Paper
Add Code

Behaviour Suite for Reinforcement Learning

2 code implementations • ICLR 2020 • Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado van Hasselt

bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives.

reinforcement-learning Reinforcement Learning (RL)

1,466

Paper
Code

General non-linear Bellman equations

no code implementations • 8 Jul 2019 • Hado van Hasselt, John Quan, Matteo Hessel, Zhongwen Xu, Diana Borsa, Andre Barreto

We consider a general class of non-linear Bellman equations.

Paper
Add Code

On Inductive Biases in Deep Reinforcement Learning

no code implementations • 5 Jul 2019 • Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver

These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters.

Continuous Control reinforcement-learning +1

Paper
Add Code

When to use parametric models in reinforcement learning?

2 code implementations • NeurIPS 2019 • Hado van Hasselt, Matteo Hessel, John Aslanides

We examine the question of when and how parametric models are most useful in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

1,526

Paper
Code

Meta-learning of Sequential Strategies

no code implementations • 8 May 2019 • Pedro A. Ortega, Jane. X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class.

Meta-Learning

Paper
Add Code

Towards Consistent Performance on Atari using Expert Demonstrations

no code implementations • ICLR 2019 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Despite significant advances in the field of deep Reinforcement Learning (RL), today's algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games.

Atari Games Reinforcement Learning (RL)

Paper
Add Code

Universal Successor Features Approximators

1 code implementation • ICLR 2019 • Diana Borsa, André Barreto, John Quan, Daniel Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul

We focus on one aspect in particular, namely the ability to generalise to unseen tasks.

Navigate Reinforcement Learning (RL)

Paper
Code

Deep Reinforcement Learning and the Deadly Triad

no code implementations • 6 Dec 2018 • Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models - deep Q-networks trained with experience replay - analysing how the components of this system play a role in the emergence of the deadly triad, and in the agent's performance

Learning Theory reinforcement-learning +1

Paper
Add Code

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations • 16 Nov 2018 • Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Paper
Add Code

Multi-task Deep Reinforcement Learning with PopArt

2 code implementations • 12 Sep 2018 • Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt

This means the learning algorithm is general, but each solution is not; each agent can only solve the one task it was trained on.

Ranked #1 on Visual Navigation on Dmlab-30

Atari Games Multi-Task Learning +2

2,513

Paper
Code

Observe and Look Further: Achieving Consistent Performance on Atari

no code implementations • 29 May 2018 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Montezuma's Revenge Reinforcement Learning (RL)

Paper
Add Code

Meta-Gradient Reinforcement Learning

1 code implementation • NeurIPS 2018 • Zhongwen Xu, Hado van Hasselt, David Silver

Instead, the majority of reinforcement learning algorithms estimate and/or optimise a proxy for the value function.

Meta-Learning reinforcement-learning +1

Paper
Code

Distributed Prioritized Experience Replay

15 code implementations • ICLR 2018 • Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver

We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible.

Ranked #1 on Atari Games on Atari 2600 Boxing

Atari Games reinforcement-learning +1

30,980

Paper
Code

Unicorn: Continual Learning with a Universal, Off-policy Agent

no code implementations • 22 Feb 2018 • Daniel J. Mankowitz, Augustin Žídek, André Barreto, Dan Horgan, Matteo Hessel, John Quan, Junhyuk Oh, Hado van Hasselt, David Silver, Tom Schaul

Some real-world domains are best characterized as a single task, but for others this perspective is limiting.

Continual Learning

Paper
Add Code

Rainbow: Combining Improvements in Deep Reinforcement Learning

32 code implementations • 6 Oct 2017 • Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver

The deep reinforcement learning community has made several independent improvements to the DQN algorithm.

Ranked #9 on Atari Games on atari game

Montezuma's Revenge reinforcement-learning +1

7,369

Paper
Code

StarCraft II: A New Challenge for Reinforcement Learning

11 code implementations • 16 Aug 2017 • Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing

Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain.

Ranked #1 on Starcraft II on MoveToBeacon

reinforcement-learning Reinforcement Learning (RL) +2

7,907

Paper
Code

The Predictron: End-To-End Learning and Planning

1 code implementation • ICML 2017 • David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris

One of the key challenges of artificial intelligence is to learn models that are effective in the context of planning.

290

Paper
Code

Successor Features for Transfer in Reinforcement Learning

no code implementations • NeurIPS 2017 • André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver

Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Learning values across many orders of magnitude

no code implementations • NeurIPS 2016 • Hado van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver

Most learning algorithms are not invariant to the scale of the function that is being approximated.

Ranked #12 on Atari Games on Atari 2600 Centipede

Atari Games reinforcement-learning +1

Paper
Add Code

Deep Reinforcement Learning in Large Discrete Action Spaces

2 code implementations • 24 Dec 2015 • Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.

Recommendation Systems reinforcement-learning +1

Paper
Code

Dueling Network Architectures for Deep Reinforcement Learning

73 code implementations • 20 Nov 2015 • Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas

In recent years there have been many successes of using deep representations in reinforcement learning.

Ranked #1 on Atari Games on Atari 2600 Pong

Atari Games reinforcement-learning +1

47,594

Paper
Code

Deep Reinforcement Learning with Double Q-learning

97 code implementations • 22 Sep 2015 • Hado van Hasselt, Arthur Guez, David Silver

The popular Q-learning algorithm is known to overestimate action values under certain conditions.

Ranked #10 on Atari Games on Atari 2600 Gopher

Atari Games Q-Learning +1

47,594

Paper
Code

Learning to Predict Independent of Span

no code implementations • 19 Aug 2015 • Hado van Hasselt, Richard S. Sutton

If predictions are made at a high rate or span over a large amount of time, substantial computation can be required to store all relevant observations and to update all predictions when the outcome is finally observed.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.