Search Results for author: Marc G. Bellemare

Found 55 papers, 28 papers with code

A Distributional Analogue to the Successor Representation

1 code implementation13 Feb 2024 Harley Wiltzer, Jesse Farebrother, Arthur Gretton, Yunhao Tang, André Barreto, Will Dabney, Marc G. Bellemare, Mark Rowland

This paper contributes a new approach for distributional reinforcement learning which elucidates a clean separation of transition structure and reward in the learning process.

Distributional Reinforcement Learning Model-based Reinforcement Learning +1

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

1 code implementation21 Nov 2023 Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

1 code implementation NeurIPS 2023 Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare

To conclude, we develop a distribution-aware procedure which finds such paths, navigating away from noisy neighborhoods in order to improve the robustness of a policy.

Continuous Control

Bootstrapped Representations in Reinforcement Learning

no code implementations16 Jun 2023 Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

Auxiliary Learning reinforcement-learning +1

Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks

1 code implementation25 Apr 2023 Jesse Farebrother, Joshua Greaves, Rishabh Agarwal, Charline Le Lan, Ross Goroshin, Pablo Samuel Castro, Marc G. Bellemare

Combined with a suitable off-policy learning rule, the result is a representation learning algorithm that can be understood as extending Mahadevan & Maggioni (2007)'s proto-value functions to deep reinforcement learning -- accordingly, we call the resulting object proto-value networks.

Atari Games reinforcement-learning +1

An Analysis of Quantile Temporal-Difference Learning

no code implementations11 Jan 2023 Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning.

Distributional Reinforcement Learning reinforcement-learning +1

A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces

no code implementations8 Dec 2022 Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare

In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns.

Image Compression reinforcement-learning +1

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

1 code implementation3 Jun 2022 Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

To address these issues, we present reincarnating RL as an alternative workflow or class of problem settings, where prior computational work (e. g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another.

Atari Games Humanoid Control +2

On the Generalization of Representations in Reinforcement Learning

1 code implementation1 Mar 2022 Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

Atari Games reinforcement-learning +2

On Bonus-Based Exploration Methods in the Arcade Learning Environment

no code implementations22 Sep 2021 Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge

Deep Reinforcement Learning at the Edge of the Statistical Precipice

3 code implementations NeurIPS 2021 Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.

reinforcement-learning Reinforcement Learning (RL)

Metrics and continuity in reinforcement learning

2 code implementations2 Feb 2021 Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro

In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible.

reinforcement-learning Reinforcement Learning (RL)

The Importance of Pessimism in Fixed-Dataset Policy Optimization

1 code implementation ICLR 2021 Jacob Buckman, Carles Gelada, Marc G. Bellemare

To avoid this, algorithms can follow the pessimism principle, which states that we should choose the policy which acts optimally in the worst possible world.

Representations for Stable Off-Policy Reinforcement Learning

no code implementations ICML 2020 Dibya Ghosh, Marc G. Bellemare

Reinforcement learning with function approximation can be unstable and even divergent, especially when combined with off-policy learning and Bellman updates.

reinforcement-learning Reinforcement Learning (RL) +1

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

no code implementations27 Mar 2020 Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.

Q-Learning reinforcement-learning +1

Zooming for Efficient Model-Free Reinforcement Learning in Metric Spaces

no code implementations9 Mar 2020 Ahmed Touati, Adrien Ali Taiga, Marc G. Bellemare

Despite the wealth of research into provably efficient reinforcement learning algorithms, most works focus on tabular representation and thus struggle to handle exponentially or infinitely large state-action spaces.

reinforcement-learning Reinforcement Learning (RL)

On Catastrophic Interference in Atari 2600 Games

1 code implementation28 Feb 2020 William Fedus, Dibya Ghosh, John D. Martin, Marc G. Bellemare, Yoshua Bengio, Hugo Larochelle

Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.

Atari Games reinforcement-learning +1

On Bonus Based Exploration Methods In The Arcade Learning Environment

no code implementations ICLR 2020 Adrien Ali Taiga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction

no code implementations28 Nov 2019 Vishal Jain, William Fedus, Hugo Larochelle, Doina Precup, Marc G. Bellemare

Empirically, we find that these techniques improve the performance of a baseline deep reinforcement learning agent applied to text-based games.

reinforcement-learning Reinforcement Learning (RL) +1

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment

no code implementations6 Aug 2019 Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).

Benchmarking Montezuma's Revenge

DeepMDP: Learning Continuous Latent Space Models for Representation Learning

no code implementations6 Jun 2019 Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare

We show that the optimization of these objectives guarantees (1) the quality of the latent space as a representation of the state space and (2) the quality of the DeepMDP as a model of the environment.

Reinforcement Learning (RL) Representation Learning

Statistics and Samples in Distributional Reinforcement Learning

no code implementations21 Feb 2019 Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney

We present a unifying framework for designing and analysing distributional reinforcement learning (DRL) algorithms in terms of recursively estimating statistics of the return distribution.

Distributional Reinforcement Learning reinforcement-learning +1

Distributional reinforcement learning with linear function approximation

no code implementations8 Feb 2019 Marc G. Bellemare, Nicolas Le Roux, Pablo Samuel Castro, Subhodeep Moitra

Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.

Distributional Reinforcement Learning reinforcement-learning +1

Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue

no code implementations31 Jan 2019 Kory W. Mathewson, Pablo Samuel Castro, Colin Cherry, George Foster, Marc G. Bellemare

We consider the problem of designing an artificial agent capable of interacting with humans in collaborative dialogue to produce creative, engaging narratives.

Specificity

The Value Function Polytope in Reinforcement Learning

no code implementations31 Jan 2019 Robert Dadashi, Adrien Ali Taïga, Nicolas Le Roux, Dale Schuurmans, Marc G. Bellemare

We establish geometric and topological properties of the space of value functions in finite state-action Markov decision processes.

reinforcement-learning Reinforcement Learning (RL)

A Comparative Analysis of Expected and Distributional Reinforcement Learning

no code implementations30 Jan 2019 Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare

Since their introduction a year ago, distributional approaches to reinforcement learning (distributional RL) have produced strong results relative to the standard approach which models expected values (expected RL).

Distributional Reinforcement Learning reinforcement-learning +1

Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift

no code implementations27 Jan 2019 Carles Gelada, Marc G. Bellemare

We complement our analysis with an empirical evaluation of the two techniques in an off-policy setting on the game Pong from the Atari domain where we find discounted COP-TD to be better behaved in practice than the soft normalization penalty.

reinforcement-learning Reinforcement Learning (RL)

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents

1 code implementation17 Dec 2018 Felipe Petroski Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Jiale Zhi, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman

We lessen this friction, by (1) training several algorithms at scale and releasing trained models, (2) integrating with a previous Deep RL model release, and (3) releasing code that makes it easy for anyone to load, visualize, and analyze such models.

Atari Games Friction +2

Approximate Exploration through State Abstraction

no code implementations29 Aug 2018 Adrien Ali Taïga, Aaron Courville, Marc G. Bellemare

Next, we show how a given density model can be related to an abstraction and that the corresponding pseudo-count bonus can act as a substitute in MBIE-EB combined with this abstraction, but may lead to either under- or over-exploration.

Count-Based Exploration with the Successor Representation

2 code implementations ICLR 2019 Marlos C. Machado, Marc G. Bellemare, Michael Bowling

In this paper we introduce a simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required.

Atari Games Efficient Exploration +1

An Analysis of Categorical Distributional Reinforcement Learning

no code implementations22 Feb 2018 Mark Rowland, Marc G. Bellemare, Will Dabney, Rémi Munos, Yee Whye Teh

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance.

Distributional Reinforcement Learning reinforcement-learning +1

Distributional Reinforcement Learning with Quantile Regression

17 code implementations27 Oct 2017 Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos

In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean.

Atari Games Distributional Reinforcement Learning +3

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

7 code implementations18 Sep 2017 Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games.

Atari Games

A Distributional Perspective on Reinforcement Learning

22 code implementations ICML 2017 Marc G. Bellemare, Will Dabney, Rémi Munos

We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning.

Atari Games reinforcement-learning +1

Automated Curriculum Learning for Neural Networks

no code implementations ICML 2017 Alex Graves, Marc G. Bellemare, Jacob Menick, Remi Munos, Koray Kavukcuoglu

We introduce a method for automatically selecting the path, or syllabus, that a neural network follows through a curriculum so as to maximise learning efficiency.

Count-Based Exploration with Neural Density Models

1 code implementation ICML 2017 Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos

This pseudo-count was used to generate an exploration bonus for a DQN agent and combined with a mixed Monte Carlo update was sufficient to achieve state of the art on the Atari 2600 game Montezuma's Revenge.

Montezuma's Revenge

Q($λ$) with Off-Policy Corrections

no code implementations16 Feb 2016 Anna Harutyunyan, Marc G. Bellemare, Tom Stepleton, Remi Munos

We propose and analyze an alternate approach to off-policy multi-step temporal difference learning, in which off-policy returns are corrected with the current Q-function in terms of rewards, rather than with the target policy in terms of transition probabilities.

Increasing the Action Gap: New Operators for Reinforcement Learning

2 code implementations15 Dec 2015 Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos

Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator.

Atari Games Q-Learning +2

Human level control through deep reinforcement learning

7 code implementations25 Feb 2015 Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg1 & Demis Hassabis

We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.

Atari Games reinforcement-learning +1

Compress and Control

no code implementations19 Nov 2014 Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins

This paper describes a new information-theoretic policy evaluation technique for reinforcement learning.

Reinforcement Learning (RL)

The Arcade Learning Environment: An Evaluation Platform for General Agents

3 code implementations19 Jul 2012 Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling

We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning.

Atari Games Benchmarking +4

Cannot find the paper you are looking for? You can Submit a new open access paper.