Search Results for author: Matteo Hessel

Found 28 papers, 10 papers with code

Self-Consistent Models and Values

no code implementations • NeurIPS 2021 • Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment.

Emphatic Algorithms for Deep Reinforcement Learning

no code implementations • 21 Jun 2021 • Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

In this paper, we extend the use of emphatic methods to deep reinforcement learning agents.

Atari Games
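
The abstract above names emphatic methods without showing their mechanics. As a hedged sketch (not the paper's deep-RL algorithm), here is classical linear emphatic TD(0), the precursor the paper extends: a follow-on trace accumulates discounted importance ratios and reweights each TD update. The feature vectors, rewards, and importance ratios below are made up for illustration.

```python
import numpy as np

def emphatic_td0(features, rewards, rhos, gamma=0.9, alpha=0.1):
    """features: per-step state features (length T+1); rhos: importance ratios (length T)."""
    w = np.zeros(len(features[0]))
    F = 1.0                                            # follow-on trace, interest i_t = 1
    for t in range(len(rewards)):
        td = rewards[t] + gamma * (w @ features[t + 1]) - w @ features[t]
        w += alpha * F * rhos[t] * td * features[t]    # emphasis-weighted TD update
        F = gamma * rhos[t] * F + 1.0                  # accumulate discounted ratios
    return w
```

With all importance ratios equal to 1 this reduces to on-policy TD(0) with an extra emphasis weighting; off-policy, the trace up-weights states that the target policy would visit often.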

Podracer architectures for scalable Reinforcement Learning

no code implementations • 13 Apr 2021 • Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt

Supporting state-of-the-art AI research requires balancing rapid prototyping, ease of use, and quick iteration, with the ability to deploy experiments at a scale traditionally associated with production systems. Deep learning frameworks such as TensorFlow, PyTorch and JAX allow users to transparently make use of accelerators, such as TPUs and GPUs, to offload the more computationally intensive parts of training and inference in modern deep learning systems.

Discovery of Options via Meta-Learned Subgoals

no code implementations • NeurIPS 2021 • Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.

Discovering Reinforcement Learning Algorithms

1 code implementation • NeurIPS 2020 • Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Automating the discovery of update rules from data could lead to more efficient algorithms, or algorithms that are better adapted to specific environments.

Atari Games • Meta-Learning

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

no code implementations • NeurIPS 2020 • Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver

In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment.

Q-Learning
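
The core loop described above can be sketched on a toy problem. This is a hedged illustration of the meta-gradient idea, not the paper's algorithm: the "discovered objective" is collapsed to a single meta-parameter `eta` scaling an inner loss, and the outer (true) objective is differentiated through the inner update by hand.

```python
import numpy as np

np.random.seed(0)
alpha, beta = 0.1, 0.5   # inner and outer step sizes (illustrative values)
target = 3.0             # what the outer objective actually rewards
eta = 0.1                # meta-parameter of the parameterised inner objective

for _ in range(200):
    theta = np.random.uniform(-5, 5)             # fresh inner parameter each trial
    # Inner update under the parameterised objective eta * (theta - target)^2
    theta_new = theta - alpha * 2.0 * eta * (theta - target)
    # Outer (true) objective J = (theta_new - target)^2, differentiated
    # through the inner update with respect to eta:
    d_theta_new_d_eta = -alpha * 2.0 * (theta - target)
    d_outer_d_eta = 2.0 * (theta_new - target) * d_theta_new_d_eta
    eta -= beta * d_outer_d_eta

print(2 * alpha * eta)   # approaches 1.0: one inner step then lands on target
```

The meta-gradient drives `eta` toward `1 / (2 * alpha)`, so a single step on the learned objective moves the inner parameter straight to the outer target; the paper replaces this scalar with a deep network learned from interactive experience.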

Expected Eligibility Traces

no code implementations • 3 Jul 2020 • Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence.

A Self-Tuning Actor-Critic Algorithm

no code implementations • NeurIPS 2020 • Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.

Atari Games

What Can Learned Intrinsic Rewards Capture?

no code implementations • ICML 2020 • Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.

Off-Policy Actor-Critic with Shared Experience Replay

no code implementations • ICML 2020 • Simon Schmitt, Matteo Hessel, Karen Simonyan

We investigate the combination of actor-critic reinforcement learning algorithms with uniform large-scale experience replay and propose solutions for two challenges: (a) efficient actor-critic learning with experience replay (b) stability of off-policy learning where agents learn from other agents behaviour.

Atari Games

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Behaviour Suite for Reinforcement Learning

2 code implementations • ICLR 2020 • Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado van Hasselt

bsuite is a collection of carefully designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives.

On Inductive Biases in Deep Reinforcement Learning

no code implementations • 5 Jul 2019 • Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver

These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters.

Continuous Control

When to use parametric models in reinforcement learning?

2 code implementations • NeurIPS 2019 • Hado van Hasselt, Matteo Hessel, John Aslanides

We examine the question of when and how parametric models are most useful in reinforcement learning.

Towards Consistent Performance on Atari using Expert Demonstrations

no code implementations • ICLR 2019 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Despite significant advances in the field of deep Reinforcement Learning (RL), today's algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games.

Atari Games

Scaling shared model governance via model splitting

no code implementations • ICLR 2019 • Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation.

Deep Reinforcement Learning and the Deadly Triad

no code implementations • 6 Dec 2018 • Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models - deep Q-networks trained with experience replay - analysing how the components of this system play a role in the emergence of the deadly triad, and in the agent's performance.

Learning Theory
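
The deadly triad the abstract refers to is the combination of function approximation, bootstrapping, and off-policy updates. A textbook illustration (not from the paper) is the classic two-state example where a single shared parameter `w` represents the values `w` and `2w`, and off-policy updates only ever see the first transition:

```python
# Classic two-state deadly-triad illustration: one shared parameter w gives
# values w and 2w for states s1 and s2. Off-policy updates only ever sample
# the s1 -> s2 transition (reward 0), bootstrapping from the estimate 2w.

alpha, gamma = 0.1, 0.99
w = 1.0
history = [w]
for _ in range(100):
    td_error = 0.0 + gamma * (2 * w) - w   # bootstrap target minus current estimate
    w += alpha * td_error * 1.0            # feature of s1 is 1
    history.append(w)

print(history[-1])   # grows without bound instead of converging
```

Each update multiplies `w` by `1 + alpha * (2 * gamma - 1) > 1` whenever `gamma > 0.5`, so the parameter diverges geometrically; remove any one leg of the triad and the divergence disappears.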

Multi-task Deep Reinforcement Learning with PopArt

1 code implementation • 12 Sep 2018 • Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt

This means the learning algorithm is general, but each solution is not; each agent can only solve the one task it was trained on.

Atari Games • Multi-Task Learning
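
PopArt addresses the multi-task problem above by normalising the value targets of each task. As a hedged sketch of the idea (a linear output head, with hyperparameters chosen for illustration): track running moments of the targets, and whenever the statistics change, rescale the output layer so the unnormalised prediction is preserved.

```python
import numpy as np

class PopArtHead:
    """Linear value head with PopArt-style adaptive target normalisation."""

    def __init__(self, n_features):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.mu, self.nu = 0.0, 1.0   # running first and second moments of targets

    def unnormalised(self, x):
        sigma = np.sqrt(self.nu - self.mu ** 2)
        return sigma * (self.w @ x + self.b) + self.mu

    def update_stats(self, target, beta=1e-2):
        old_mu = self.mu
        old_sigma = np.sqrt(self.nu - old_mu ** 2)
        self.mu = (1 - beta) * self.mu + beta * target
        self.nu = (1 - beta) * self.nu + beta * target ** 2
        new_sigma = np.sqrt(self.nu - self.mu ** 2)
        # Preserve outputs: rescale weights and bias to undo the statistics change
        self.w *= old_sigma / new_sigma
        self.b = (old_sigma * self.b + old_mu - self.mu) / new_sigma
```

Because the rescaling exactly cancels the new normalisation, predictions are unchanged at the moment the statistics update; gradients are then taken against normalised targets, so tasks with wildly different reward scales contribute comparable updates.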

Observe and Look Further: Achieving Consistent Performance on Atari

1 code implementation • 29 May 2018 • Tobias Pohlen, Bilal Piot, Todd Hester, Mohammad Gheshlaghi Azar, Dan Horgan, David Budden, Gabriel Barth-Maron, Hado van Hasselt, John Quan, Mel Večerík, Matteo Hessel, Rémi Munos, Olivier Pietquin

Despite significant advances in the field of deep Reinforcement Learning (RL), today's algorithms still fail to learn human-level policies consistently over a set of diverse tasks such as Atari 2600 games.

Montezuma's Revenge

Distributed Prioritized Experience Replay

14 code implementations • ICLR 2018 • Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver

We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible.

Atari Games
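
The replay scheme this architecture distributes can be sketched as follows. This is a hedged, minimal single-process illustration (a list-backed buffer, not the paper's sharded sum-tree implementation): transitions are sampled with probability proportional to priority^alpha, and importance weights correct for the resulting bias.

```python
import numpy as np

class PrioritisedReplay:
    """Minimal prioritised replay buffer (illustrative, O(n) sampling)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.items, self.priorities = [], []

    def add(self, transition, priority):
        if len(self.items) >= self.capacity:   # evict the oldest transition
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, beta=0.4):
        p = np.array(self.priorities) ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.items), batch_size, p=probs)
        weights = (len(self.items) * probs[idx]) ** (-beta)
        weights /= weights.max()   # normalise so weights only scale updates down
        return [self.items[i] for i in idx], idx, weights
```

In the distributed setting, many actors feed such a buffer with initial priorities while a single learner samples from it and writes updated priorities back after each gradient step.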

Dueling Network Architectures for Deep Reinforcement Learning

65 code implementations • 20 Nov 2015 • Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas

In recent years there have been many successes of using deep representations in reinforcement learning.

Atari Games
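
The dueling decomposition itself is compact enough to sketch in a few lines. This is a hedged numpy illustration of the head's output (the paper's networks produce `value` and `advantages` from shared convolutional features): Q-values are rebuilt from a state value and per-action advantages, with the mean advantage subtracted so the decomposition is identifiable.

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine V(s) and A(s, .) into Q(s, .) with the mean advantage subtracted."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

q = dueling_q(2.0, [1.0, -1.0, 0.0])
print(q)   # [3. 1. 2.] -- the mean of q equals V(s) = 2.0
```

Subtracting the mean pins down the split between value and advantage: without it, any constant could be shifted between the two streams without changing Q.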
