Search Results for author: Rishabh Agarwal

Found 32 papers, 19 papers with code

Evaluation Function Approximation for Scrabble

no code implementations • 25 Jan 2019 • Rishabh Agarwal

The current state-of-the-art Scrabble agents are not learning-based but depend on truncated Monte Carlo simulations, and the quality of such agents is contingent upon the time available for running the simulations.

Bayesian Optimization • Evolutionary Algorithms • +1

Learning to Generalize from Sparse and Underspecified Rewards

1 code implementation • 19 Feb 2019 • Rishabh Agarwal, Chen Liang, Dale Schuurmans, Mohammad Norouzi

The parameters of the auxiliary reward function are optimized with respect to the validation performance of a trained policy.

Bayesian Optimization • Semantic Parsing

32,729

An Optimistic Perspective on Offline Reinforcement Learning

1 code implementation • 10 Jul 2019 • Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

The DQN replay dataset can serve as an offline RL benchmark and is open-sourced.

Atari Games • DQN Replay Dataset • +3

505

Striving for Simplicity in Off-Policy Deep Reinforcement Learning

no code implementations • 25 Sep 2019 • Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

This paper advocates the use of offline (batch) reinforcement learning (RL) to help (1) isolate the contributions of exploitation vs. exploration in off-policy deep RL, (2) improve reproducibility of deep RL research, and (3) facilitate the design of simpler deep RL algorithms.

Atari Games • Offline RL • +3

Neural Additive Models: Interpretable Machine Learning with Neural Nets

6 code implementations • NeurIPS 2021 • Rishabh Agarwal, Levi Melnick, Nicholas Frosst, Xuezhou Zhang, Ben Lengerich, Rich Caruana, Geoffrey Hinton

They perform similarly to existing state-of-the-art generalized additive models in accuracy, but are more flexible because they are based on neural nets instead of boosted trees.

Additive models • BIG-bench Machine Learning • +3

32,729

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

2 code implementations • 24 Jun 2020 • Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.

Atari Games • DQN Replay Dataset • +3

12,778

Revisiting Fundamentals of Experience Replay

2 code implementations • ICML 2020 • William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney

Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but there remain significant gaps in our understanding.

DQN Replay Dataset • Q-Learning • +1

32,729

IITK at SemEval-2020 Task 10: Transformers for Emphasis Selection

1 code implementation • SEMEVAL 2020 • Vipul Singhal, Sahil Dhull, Rishabh Agarwal, Ashutosh Modi

This paper describes the system proposed for addressing the research problem posed in Task 10 of SemEval-2020: Emphasis Selection For Written Text in Visual Media.

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

1 code implementation • ICLR 2021 • Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network.

reinforcement-learning • Reinforcement Learning (RL)

505

RL Unplugged: A Collection of Benchmarks for Offline Reinforcement Learning

1 code implementation • NeurIPS 2020 • Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas

Offline RL • reinforcement-learning • +1

12,778

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

1 code implementation • ICLR 2021 • Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare

Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states.

reinforcement-learning • Reinforcement Learning (RL) • +1

32,738

Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

1 code implementation • ICML Workshop URL 2021 • Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon

The shortcomings of maximum likelihood estimation in the context of model-based reinforcement learning have been highlighted by an increasing number of papers.

Model-based Reinforcement Learning • reinforcement-learning • +1

Deep Reinforcement Learning at the Edge of the Statistical Precipice

3 code implementations • NeurIPS 2021 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs.

reinforcement-learning • Reinforcement Learning (RL)

694

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

no code implementations • ICLR 2022 • Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine

In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.

Atari Games • D4RL • +3

On the Generalization of Representations in Reinforcement Learning

1 code implementation • 1 Mar 2022 • Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

Atari Games • reinforcement-learning • +2

32,735

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

1 code implementation • 3 Jun 2022 • Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

To address these issues, we present reincarnating RL as an alternative workflow or class of problem settings, where prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another.

Atari Games • Humanoid Control • +2

Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

no code implementations • 28 Nov 2022 • Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine

The potential of offline reinforcement learning (RL) is that high-capacity models trained on large, heterogeneous datasets can lead to agents that generalize broadly, analogously to similar advances in vision and NLP.

Offline RL • Q-Learning • +2

A Novel Stochastic Gradient Descent Algorithm for Learning Principal Subspaces

no code implementations • 8 Dec 2022 • Charline Le Lan, Joshua Greaves, Jesse Farebrother, Mark Rowland, Fabian Pedregosa, Rishabh Agarwal, Marc G. Bellemare

In this paper, we derive an algorithm that learns a principal subspace from sample entries, can be applied when the approximate subspace is represented by a neural network, and hence can be scaled to datasets with an effectively infinite number of rows and columns.

Image Compression • reinforcement-learning • +1

Revisiting Bellman Errors for Offline Model Selection

1 code implementation • 31 Jan 2023 • Joshua P. Zitovsky, Daniel de Marchi, Rishabh Agarwal, Michael R. Kosorok

Offline model selection (OMS), that is, choosing the best policy from a set of many policies given only logged data, is crucial for applying offline RL in real-world settings.

Atari Games • Model Selection • +1

The Dormant Neuron Phenomenon in Deep Reinforcement Learning

1 code implementation • 24 Feb 2023 • Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci

In this work we identify the dormant neuron phenomenon in deep reinforcement learning, where an agent's network suffers from an increasing number of inactive neurons, thereby affecting network expressivity.

reinforcement-learning • Reinforcement Learning (RL)

10,367

Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks

1 code implementation • 25 Apr 2023 • Jesse Farebrother, Joshua Greaves, Rishabh Agarwal, Charline Le Lan, Ross Goroshin, Pablo Samuel Castro, Marc G. Bellemare

Combined with a suitable off-policy learning rule, the result is a representation learning algorithm that can be understood as extending Mahadevan & Maggioni (2007)'s proto-value functions to deep reinforcement learning -- accordingly, we call the resulting object proto-value networks.

Atari Games • reinforcement-learning • +1

32,732

Bigger, Better, Faster: Human-level Atari with human-level efficiency

3 code implementations • 30 May 2023 • Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark.

Ranked #1 on the Atari Games 100k task (Atari 100k benchmark)

Atari Games 100k

32,733

Bootstrapped Representations in Reinforcement Learning

no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

Auxiliary Learning • reinforcement-learning • +1

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

no code implementations • 23 Jun 2023 • Rishabh Agarwal, Nino Vieillard, Yongchao Zhou, Piotr Stanczyk, Sabela Ramos, Matthieu Geist, Olivier Bachem

Instead of solely relying on a fixed set of output sequences, GKD trains the student on its self-generated output sequences by leveraging feedback from the teacher on such sequences.

Arithmetic Reasoning • Knowledge Distillation • +1

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

no code implementations • 12 Oct 2023 • Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, Rishabh Agarwal

Finally, in practical scenarios with models of varying sizes, first using distillation to boost the performance of the target model and then applying DistillSpec to train a well-aligned draft model can reduce decoding latency by 6-10x with minimal performance drop, compared to standard decoding without distillation.

Knowledge Distillation • Language Modelling • +1

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

1 code implementation • 21 Nov 2023 • Max Schwarzer, Jesse Farebrother, Joshua Greaves, Ekin Dogus Cubuk, Rishabh Agarwal, Aaron Courville, Marc G. Bellemare, Sergei Kalinin, Igor Mordatch, Pablo Samuel Castro, Kevin M. Roccapriore

We introduce a machine learning approach to determine the transition dynamics of silicon atoms on a single layer of carbon atoms, when stimulated by the electron beam of a scanning transmission electron microscope (STEM).

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

no code implementations • 11 Dec 2023 • Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, Kathleen Kenealy, Kevin Swersky, Kshiteej Mahajan, Laura Culp, Lechao Xiao, Maxwell L. Bileschi, Noah Constant, Roman Novak, Rosanne Liu, Tris Warkentin, Yundi Qian, Yamini Bansal, Ethan Dyer, Behnam Neyshabur, Jascha Sohl-Dickstein, Noah Fiedel

To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times.

Math

V-STaR: Training Verifiers for Self-Taught Reasoners

no code implementations • 9 Feb 2024 • Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Aaron Courville, Alessandro Sordoni, Rishabh Agarwal

Common self-improvement approaches for large language models (LLMs), such as STaR (Zelikman et al., 2022), iteratively fine-tune LLMs on self-generated solutions to improve their problem-solving ability.

Code Generation • Math

Transformers Can Achieve Length Generalization But Not Robustly

no code implementations • 14 Feb 2024 • Yongchao Zhou, Uri Alon, Xinyun Chen, Xuezhi Wang, Rishabh Agarwal, Denny Zhou

We show that the success of length generalization is intricately linked to the data format and the type of position encoding.

Position

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

no code implementations • 6 Mar 2024 • Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal

Observing this discrepancy, in this paper, we investigate whether the scalability of deep RL can also be improved simply by using classification in place of regression for training value functions.

Atari Games • regression • +1

Many-Shot In-Context Learning

no code implementations • 17 Apr 2024 • Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Stephanie Chan, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases and can learn high-dimensional functions with numerical inputs.

An Optimistic Perspective on Offline Deep Reinforcement Learning

1 code implementation • ICML 2020 • Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

The DQN replay dataset can serve as an offline RL benchmark and is open-sourced.

Atari Games • DQN Replay Dataset • +3

505
