no code implementations • 6 Mar 2024 • Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal
Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions.
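The core trick behind this line of work is to replace the scalar regression target with a categorical one. A minimal sketch, assuming a "two-hot" encoding over a fixed support of return bins (the paper also studies richer target distributions; the bin range and count here are made up for illustration):

```python
import numpy as np

def two_hot(target, bins):
    """Encode a scalar regression target as a categorical distribution over
    fixed bins ("two-hot"): the probability mass is split between the two
    nearest bin centres so that the expectation recovers the target."""
    target = np.clip(target, bins[0], bins[-1])
    upper = np.searchsorted(bins, target)   # first bin centre >= target
    if upper == 0:
        probs = np.zeros(len(bins))
        probs[0] = 1.0
        return probs
    lower = upper - 1
    w_upper = (target - bins[lower]) / (bins[upper] - bins[lower])
    probs = np.zeros(len(bins))
    probs[lower], probs[upper] = 1.0 - w_upper, w_upper
    return probs

def cross_entropy_value_loss(logits, target_value, bins):
    """Cross-entropy between predicted return logits and the two-hot target,
    used in place of a squared regression loss on the value estimate."""
    log_probs = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    return -np.dot(two_hot(target_value, bins), log_probs)

bins = np.linspace(-10.0, 10.0, 51)   # hypothetical fixed return support
logits = np.zeros_like(bins)          # an untrained, uniform prediction
loss = cross_entropy_value_loss(logits, 3.7, bins)
```

Because the two-hot encoding preserves the target in expectation, the classification loss can stand in for regression without changing what the value network is asked to represent.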
no code implementations • 19 Feb 2024 • Johan Obando-Ceron, Aaron Courville, Pablo Samuel Castro
Recent work has shown that deep reinforcement learning agents have difficulty in effectively using their network parameters.
no code implementations • 13 Feb 2024 • Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, Pablo Samuel Castro
The recent rapid progress in (self) supervised learning models is in large part predicted by empirical scaling laws: a model's performance scales proportionally to its size.
1 code implementation • 23 Nov 2023 • Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin
Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research.
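The standard starting point for learning from pairwise preferences is the Bradley-Terry model: a reward model is trained so that the preferred response receives the higher score. A minimal sketch of that loss (scalar rewards stand in for a full reward model here):

```python
import numpy as np

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry negative log-likelihood for one pairwise preference:
    the probability that the chosen response beats the rejected one is
    sigmoid(r_chosen - r_rejected)."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)), written in a numerically stable form.
    return np.log1p(np.exp(-margin))

# A reward model that already ranks the preferred answer higher incurs a
# smaller loss than one that gets the pair backwards.
good_fit = preference_loss(2.0, -1.0)
bad_fit = preference_loss(-1.0, 2.0)
```

The density-estimation view taken in the paper asks what assumptions this kind of pairwise model implicitly makes about the annotators generating the preferences.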
1 code implementation • 21 Nov 2023 • Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore
We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).
no code implementations • 5 Oct 2023 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning.
no code implementations • 25 Jul 2023 • Laixi Shi, Robert Dadashi, Yuejie Chi, Pablo Samuel Castro, Matthieu Geist
In this work, we propose to regularize towards the Q-function of the behavior policy instead of the behavior policy itself, under the premise that the Q-function can be estimated more reliably and easily by a SARSA-style estimate and handles the extrapolation error more straightforwardly.
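The proposal can be sketched in tabular form: the usual TD loss is augmented with a penalty pulling the learned Q-values toward an estimate of the behaviour policy's Q-function, rather than constraining the policy directly. This is a hypothetical tabular simplification; the paper itself works with function approximation and an actor-critic setup:

```python
import numpy as np

def regularized_td_loss(q, q_target, q_behavior, s, a, r, s2,
                        gamma=0.99, alpha=1.0):
    """Squared TD error plus a penalty pulling Q(s, a) toward an estimate
    of the behaviour policy's Q-function (e.g. one obtained by SARSA on
    the offline dataset). Tabular sketch of the idea only."""
    td_target = r + gamma * np.max(q_target[s2])
    td_error = (q[s, a] - td_target) ** 2
    behaviour_penalty = (q[s, a] - q_behavior[s, a]) ** 2
    return td_error + alpha * behaviour_penalty

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 3))     # hypothetical Q-table: 5 states, 3 actions
loss = regularized_td_loss(q, q, np.zeros((5, 3)), s=0, a=1, r=1.0, s2=2)
```

Since the behaviour Q-function is fit by on-policy SARSA-style updates, it avoids the extrapolation error that plagues off-policy evaluation of out-of-distribution actions.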
3 code implementations • 30 May 2023 • Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark.
Ranked #1 on Atari 100k
1 code implementation • 27 Apr 2023 • Joo Hyung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, Johan Obando-Ceron, Han-Byul Kim, Namhoon Lee, Elias Frantar, Yun Long, Amir Yazdanbakhsh, Shivani Agrawal, Suvinay Subramanian, Xin Wang, Sheng-Chun Kao, Xingyao Zhang, Trevor Gale, Aart Bik, Woohyun Han, Milen Ferev, Zhonglin Han, Hong-Seok Kim, Yann Dauphin, Gintare Karolina Dziugaite, Pablo Samuel Castro, Utku Evci
This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research.
1 code implementation • 25 Apr 2023 • Jesse Farebrother, Joshua Greaves, Rishabh Agarwal, Charline Le Lan, Ross Goroshin, Pablo Samuel Castro, Marc G. Bellemare
Combined with a suitable off-policy learning rule, the result is a representation learning algorithm that can be understood as extending Mahadevan & Maggioni (2007)'s proto-value functions to deep reinforcement learning -- accordingly, we call the resulting object proto-value networks.
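The classic proto-value functions being extended here are the smoothest eigenvectors of a state-transition graph's Laplacian, used as a basis for representing value functions. A minimal sketch on a 10-state chain (the environment and basis size are made up for illustration):

```python
import numpy as np

# Build the adjacency matrix of a 10-state chain: state i <-> state i+1.
n = 10
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

# Graph Laplacian L = D - A; its low-frequency eigenvectors are the
# proto-value functions of Mahadevan & Maggioni (2007).
degree = np.diag(adjacency.sum(axis=1))
laplacian = degree - adjacency
eigvals, eigvecs = np.linalg.eigh(laplacian)   # ascending eigenvalues
proto_value_functions = eigvecs[:, :4]          # 4 smoothest basis functions
```

The paper's contribution is a way to learn analogues of these basis functions with deep networks, where the transition graph is never enumerated explicitly.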
1 code implementation • 24 Feb 2023 • Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci
In this work we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing number of inactive neurons, thereby affecting network expressivity.
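The diagnostic itself is simple to state: score each neuron by its mean absolute activation normalized by the layer average, and flag it dormant when the score falls below a threshold. A sketch of that criterion (the batch and threshold here are made up):

```python
import numpy as np

def dormant_fraction(activations, tau=0.0):
    """Fraction of neurons in a layer that are (near-)dormant.
    `activations`: (batch, neurons) post-ReLU activations. A neuron's
    score is its mean absolute activation normalized by the layer
    average; it is flagged dormant when the score is at or below tau."""
    mean_abs = np.abs(activations).mean(axis=0)
    scores = mean_abs / (mean_abs.mean() + 1e-9)
    return float(np.mean(scores <= tau))

batch = np.maximum(np.random.default_rng(0).normal(size=(256, 8)), 0.0)
batch[:, :2] = 0.0   # simulate two neurons that never fire
frac = dormant_fraction(batch, tau=0.0)
```

Tracking this fraction over training is what reveals the phenomenon; the paper's remedy is to periodically reinitialize the dormant units.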
1 code implementation • 17 Jun 2022 • Laura Graesser, Utku Evci, Erich Elsen, Pablo Samuel Castro
The use of sparse neural networks has seen rapid growth in recent years, particularly in computer vision.
1 code implementation • 3 Jun 2022 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
To address these issues, we present reincarnating RL as an alternative workflow or class of problem settings, where prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another.
no code implementations • 8 Nov 2021 • Pablo Samuel Castro
In this paper I present a study in using the losses and gradients obtained during the training of a simple function approximator as a mechanism for creating musical dissonance and visual distortion in a solo piano performance setting.
1 code implementation • NeurIPS 2021 • Georg Ostrovski, Pablo Samuel Castro, Will Dabney
Learning to act from observational data without active environmental interaction is a well-known challenge in Reinforcement Learning (RL).
no code implementations • NeurIPS Workshop LatinX_in_AI 2021 • João Guilherme Madeira Araújo, Johan Samir Obando Ceron, Pablo Samuel Castro
Successful applications of deep reinforcement learning (deep RL) combine algorithmic design and careful hyper-parameter selection.
no code implementations • 29 Sep 2021 • Halley Young, Vincent Dumoulin, Pablo Samuel Castro, Jesse Engel, Cheng-Zhi Anna Huang
To tackle the combinatorial nature of composing features, we propose a compositional approach to steering music transformers, building on lightweight fine-tuning methods such as prefix tuning and bias tuning.
3 code implementations • NeurIPS 2021 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare
Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.
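One of the aggregate statistics advocated in this line of work is the interquartile mean (IQM) with bootstrap confidence intervals, which is robust to outlier runs while remaining more statistically efficient than the median. A minimal sketch (the scores are invented; the paper's accompanying library handles the multi-task case):

```python
import numpy as np

def interquartile_mean(scores):
    """Mean of the middle 50% of scores: robust to outlier runs,
    more efficient than the median."""
    scores = np.sort(np.asarray(scores).ravel())
    n = len(scores)
    return float(scores[n // 4 : n - n // 4].mean())

def bootstrap_ci(scores, n_resamples=2000, seed=0):
    """Percentile bootstrap 95% confidence interval for the IQM."""
    rng = np.random.default_rng(seed)
    stats = [interquartile_mean(rng.choice(scores, size=len(scores)))
             for _ in range(n_resamples)]
    return np.percentile(stats, [2.5, 97.5])

runs = np.array([0.3, 0.5, 0.55, 0.6, 0.62, 0.7, 0.9, 3.0])  # one outlier run
iqm = interquartile_mean(runs)
low, high = bootstrap_ci(runs)
```

With only a handful of runs per method, reporting the interval rather than a single point estimate is what makes comparisons between agents meaningful.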
1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux
Common policy gradient methods rely on the maximization of a sequence of surrogate functions.
2 code implementations • NeurIPS 2021 • Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland
We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents.
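The flavour of distance studied here combines an immediate reward-difference term with a discounted expected distance over next-state pairs, and is computed as the fixed point of a contraction. A sketch on a small, made-up Markov chain, using the independent coupling over next states so the update stays a plain linear recursion:

```python
import numpy as np

# Hypothetical 3-state chain under a fixed policy.
P = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])   # P[x, x']: next-state distribution
r = np.array([0.0, 0.0, 1.0])     # per-state rewards
gamma = 0.9

# Fixed point of U(x, y) = |r_x - r_y| + gamma * E[U(X', Y')] with X', Y'
# drawn independently from P[x], P[y]; iterate the contraction to converge.
U = np.zeros((3, 3))
for _ in range(500):
    U = np.abs(r[:, None] - r[None, :]) + gamma * P @ U @ P.T
```

States that reach reward at similar rates end up close under this distance, which is the property the learned representations are shaped to respect.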
no code implementations • ICLR Workshop SSL-RL 2021 • Manfred Diaz, Liam Paull, Pablo Samuel Castro
We offer a novel approach to balance exploration and exploitation in reinforcement learning (RL).
2 code implementations • 2 Feb 2021 • Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro
In most practical applications of reinforcement learning, it is untenable to maintain direct estimates for individual states; in continuous-state systems, it is impossible.
1 code implementation • ICLR 2021 • Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare
Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states.
2 code implementations • 20 Nov 2020 • Johan S. Obando-Ceron, Pablo Samuel Castro
Since the introduction of DQN, the vast majority of reinforcement learning research has focused on reinforcement learning with deep neural networks as function approximators.
1 code implementation • 6 Nov 2020 • Pablo Samuel Castro
Since the introduction of Generative Adversarial Networks (GANs) [Goodfellow et al., 2014] there has been a regular stream of both technical advances (e.g., Arjovsky et al. [2017]) and creative uses of these generative models (e.g., [Karras et al., 2019, Zhu et al., 2017, Jin et al., 2017]).
10 code implementations • ICML 2020 • Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, Erich Elsen
There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable sparse model to that of the largest trainable dense model.
Ranked #1 on Sparse Learning on ImageNet
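The method's core move is a periodic connectivity update that keeps the sparsity level fixed: drop the smallest-magnitude active weights, then grow the same number of new connections where the dense gradient is largest. A sketch of one such update (shapes and the update fraction are made up; grown weights start at zero as in the paper):

```python
import numpy as np

def rigl_update(weights, mask, dense_grad, update_fraction=0.3):
    """One RigL-style drop-and-grow step on a sparse weight matrix.
    Drops the lowest-magnitude active weights and grows connections
    with the largest dense-gradient magnitude, preserving sparsity."""
    n_update = int(update_fraction * mask.sum())
    active = np.flatnonzero(mask)
    inactive = np.flatnonzero(~mask.astype(bool))
    # Drop: active weights with the smallest magnitude.
    drop = active[np.argsort(np.abs(weights.ravel()[active]))[:n_update]]
    # Grow: inactive connections with the largest gradient magnitude.
    grow = inactive[np.argsort(-np.abs(dense_grad.ravel()[inactive]))[:n_update]]
    new_mask = mask.copy().ravel()
    new_mask[drop] = 0
    new_mask[grow] = 1
    new_weights = weights.copy().ravel()
    new_weights[drop] = 0.0
    new_weights[grow] = 0.0   # grown weights are initialized to zero
    return new_weights.reshape(mask.shape), new_mask.reshape(mask.shape)

rng = np.random.default_rng(0)
mask = (rng.random((8, 8)) < 0.2).astype(int)   # ~80% sparse layer
w = rng.normal(size=(8, 8)) * mask
w2, mask2 = rigl_update(w, mask, rng.normal(size=(8, 8)))
```

Because drops and grows are balanced, the network stays at its target sparsity throughout training, which is what lets sparse training match the memory footprint of sparse inference.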
1 code implementation • 21 Nov 2019 • Pablo Samuel Castro
We present new algorithms for computing and approximating bisimulation metrics in Markov Decision Processes (MDPs).
no code implementations • 31 Jul 2019 • Pablo Samuel Castro, Shijian Li, Daqing Zhang
We consider the problem of learning to behave optimally in a Markov Decision Process when a reward function is not specified, but instead we have access to a set of demonstrators of varying performance.
1 code implementation • 30 Apr 2019 • Pablo Samuel Castro
The quality of outputs produced by deep generative models for music has seen a dramatic improvement in the last few years.
no code implementations • 8 Feb 2019 • Marc G. Bellemare, Nicolas Le Roux, Pablo Samuel Castro, Subhodeep Moitra
Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.
no code implementations • 31 Jan 2019 • Kory W. Mathewson, Pablo Samuel Castro, Colin Cherry, George Foster, Marc G. Bellemare
We consider the problem of designing an artificial agent capable of interacting with humans in collaborative dialogue to produce creative, engaging narratives.
no code implementations • NeurIPS 2019 • Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle
We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks.
no code implementations • 30 Jan 2019 • Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare
Since their introduction a year ago, distributional approaches to reinforcement learning (distributional RL) have produced strong results relative to the standard approach which models expected values (expected RL).
1 code implementation • 17 Dec 2018 • Felipe Petroski Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Jiale Zhi, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman
We lessen this friction by (1) training several algorithms at scale and releasing trained models, (2) integrating with a previous Deep RL model release, and (3) releasing code that makes it easy for anyone to load, visualize, and analyze such models.
12 code implementations • 14 Dec 2018 • Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
no code implementations • 12 Nov 2018 • Pablo Samuel Castro, Maria Attarian
The use of language models for generating lyrics and poetry has received an increased interest in the last few years.