Search Results for author: Matteo Hessel

Found 29 papers, 10 papers with code

Self-Consistent Models and Values

no code implementations • NeurIPS 2021 • Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment.

reinforcement-learning Reinforcement Learning (RL)

Learning by Directional Gradient Descent

no code implementations • ICLR 2022 • David Silver, Anirudh Goyal, Ivo Danihelka, Matteo Hessel, Hado van Hasselt

How should state be constructed from a sequence of observations, so as to best achieve some objective?

Emphatic Algorithms for Deep Reinforcement Learning

no code implementations • 21 Jun 2021 • Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

In this paper, we extend the use of emphatic methods to deep reinforcement learning agents.

Atari Games reinforcement-learning +1

Muesli: Combining Improvements in Policy Optimization

2 code implementations • 13 Apr 2021 • Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt

We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss.

Ranked #8 on Atari Games on atari game

Atari Games Continuous Control

Podracer architectures for scalable Reinforcement Learning

3 code implementations • 13 Apr 2021 • Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt

Supporting state-of-the-art AI research requires balancing rapid prototyping, ease of use, and quick iteration, with the ability to deploy experiments at a scale traditionally associated with production systems. Deep learning frameworks such as TensorFlow, PyTorch and JAX allow users to transparently make use of accelerators, such as TPUs and GPUs, to offload the more computationally intensive parts of training and inference in modern deep learning systems.

reinforcement-learning Reinforcement Learning (RL)

536

Discovery of Options via Meta-Learned Subgoals

no code implementations • NeurIPS 2021 • Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.

Reinforcement Learning (RL)

Discovering Reinforcement Learning Algorithms

2 code implementations • NeurIPS 2020 • Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments.

Atari Games Meta-Learning +3

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

no code implementations • NeurIPS 2020 • Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment.

Q-Learning reinforcement-learning +1

Expected Eligibility Traces

no code implementations • 3 Jul 2020 • Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence.

counterfactual

A Self-Tuning Actor-Critic Algorithm

no code implementations • NeurIPS 2020 • Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.

Atari Games reinforcement-learning +1

What Can Learned Intrinsic Rewards Capture?

no code implementations • ICML 2020 • Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.

Off-Policy Actor-Critic with Shared Experience Replay

no code implementations • ICML 2020 • Simon Schmitt, Matteo Hessel, Karen Simonyan

We investigate the combination of actor-critic reinforcement learning algorithms with uniform large-scale experience replay and propose solutions for two challenges: (a) efficient actor-critic learning with experience replay, and (b) stability of off-policy learning where agents learn from other agents' behaviour.

Ranked #6 on Atari Games on Atari-57

Atari Games

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Reinforcement Learning (RL)

Behaviour Suite for Reinforcement Learning

2 code implementations • ICLR 2020 • Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado van Hasselt

bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives.

reinforcement-learning Reinforcement Learning (RL)

1,471

General non-linear Bellman equations

no code implementations • 8 Jul 2019 • Hado van Hasselt, John Quan, Matteo Hessel, Zhongwen Xu, Diana Borsa, Andre Barreto

We consider a general class of non-linear Bellman equations.

On Inductive Biases in Deep Reinforcement Learning

no code implementations • 5 Jul 2019 • Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver

These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters.

Continuous Control reinforcement-learning +1

When to use parametric models in reinforcement learning?

2 code implementations • NeurIPS 2019 • Hado van Hasselt, Matteo Hessel, John Aslanides

We examine the question of when and how parametric models are most useful in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

1,535

Towards Consistent Performance on Atari using Expert Demonstrations

no code implementations • ICLR 2019 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Despite significant advances in the field of deep Reinforcement Learning (RL), today's algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games.

Atari Games Reinforcement Learning (RL)

Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement

no code implementations • ICML 2018 • André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Žídek, Rémi Munos

In this paper we extend the SFs & GPI framework in two ways.

reinforcement-learning Reinforcement Learning (RL)

Scaling shared model governance via model splitting

no code implementations • ICLR 2019 • Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation.

reinforcement-learning Reinforcement Learning (RL)

Deep Reinforcement Learning and the Deadly Triad

no code implementations • 6 Dec 2018 • Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models - deep Q-networks trained with experience replay - analysing how the components of this system play a role in the emergence of the deadly triad, and in the agent's performance.

Learning Theory reinforcement-learning +1

Multi-task Deep Reinforcement Learning with PopArt

2 code implementations • 12 Sep 2018 • Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt

This means the learning algorithm is general, but each solution is not; each agent can only solve the one task it was trained on.

Ranked #1 on Visual Navigation on Dmlab-30

Atari Games Multi-Task Learning +2

2,628

Observe and Look Further: Achieving Consistent Performance on Atari

no code implementations • 29 May 2018 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Montezuma's Revenge Reinforcement Learning (RL)

Distributed Prioritized Experience Replay

15 code implementations • ICLR 2018 • Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver

We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible.

Ranked #1 on Atari Games on Atari 2600 Boxing