Search Results for author: Anurag Ajay

Found 12 papers, 1 papers with code

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

no code implementations24 Jul 2023 Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal

This paper presents a Parallel $Q$-Learning (PQL) scheme that outperforms PPO in wall-clock time while maintaining superior sample efficiency of off-policy learning.

Q-Learning reinforcement-learning

Statistical Learning under Heterogeneous Distribution Shift

no code implementations27 Feb 2023 Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

We show that, when the class $F$ is "simpler" than $G$ (measured, e. g., in terms of its metric entropy), our predictor is more resilient to heterogeneous covariate shifts} in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$.

Is Conditional Generative Modeling all you need for Decision-Making?

no code implementations28 Nov 2022 Anurag Ajay, Yilun Du, Abhi Gupta, Joshua Tenenbaum, Tommi Jaakkola, Pulkit Agrawal

We further demonstrate the advantages of modeling policies as conditional diffusion models by considering two other conditioning variables: constraints and skills.

Decision Making Offline RL +1

Distributionally Adaptive Meta Reinforcement Learning

no code implementations6 Oct 2022 Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.

Meta Reinforcement Learning reinforcement-learning +2

Offline RL Policies Should be Trained to be Adaptive

no code implementations5 Jul 2022 Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown.

Offline RL

Overcoming the Spectral Bias of Neural Value Approximation

no code implementations ICLR 2022 Ge Yang, Anurag Ajay, Pulkit Agrawal

Value approximation using deep neural networks is at the heart of off-policy deep reinforcement learning, and is often the primary module that provides learning signals to the rest of the algorithm.

Continuous Control regression +2

Understanding the Generalization Gap in Visual Reinforcement Learning

no code implementations29 Sep 2021 Anurag Ajay, Ge Yang, Ofir Nachum, Pulkit Agrawal

Deep Reinforcement Learning (RL) agents have achieved superhuman performance on several video game suites.

Data Augmentation reinforcement-learning +1

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

no code implementations ICLR 2021 Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

Reinforcement learning (RL) has achieved impressive performance in a variety of online settings in which an agent's ability to query the environment for transitions and rewards is effectively unlimited.

Few-Shot Imitation Learning Imitation Learning +3

Combining Physical Simulators and Object-Based Networks for Control

no code implementations13 Apr 2019 Anurag Ajay, Maria Bauza, Jiajun Wu, Nima Fazeli, Joshua B. Tenenbaum, Alberto Rodriguez, Leslie P. Kaelbling

Physics engines play an important role in robot planning and control; however, many real-world control problems involve complex contact dynamics that cannot be characterized analytically.

Object

Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing

no code implementations9 Aug 2018 Anurag Ajay, Jiajun Wu, Nima Fazeli, Maria Bauza, Leslie P. Kaelbling, Joshua B. Tenenbaum, Alberto Rodriguez

An efficient, generalizable physical simulator with universal uncertainty estimates has wide applications in robot state estimation, planning, and control.

Gaussian Processes Object

Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States

no code implementations4 Oct 2016 William Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine

Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering.

reinforcement-learning Reinforcement Learning (RL)

Backprop KF: Learning Discriminative Deterministic State Estimators

1 code implementation NeurIPS 2016 Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.

Autonomous Vehicles Visual Odometry

Cannot find the paper you are looking for? You can Submit a new open access paper.