Search Results for author: Pulkit Agrawal

Found 76 papers, 32 papers with code

Value Augmented Sampling for Language Model Alignment and Personalization

1 code implementation • 10 May 2024 • Seungwook Han, Idan Shenfeld, Akash Srivastava, Yoon Kim, Pulkit Agrawal

Aligning Large Language Models (LLMs) to cater to different human preferences, learn new skills, and unlearn harmful behavior is an important problem.

Language Modelling Reinforcement Learning (RL)

Learning Force Control for Legged Manipulation

no code implementations • 2 May 2024 • Tifanny Portela, Gabriel B. Margolis, Yandong Ji, Pulkit Agrawal

We propose a method for training RL policies for direct force control without requiring access to force sensing.

Reinforcement Learning (RL)

JUICER: Data-Efficient Imitation Learning for Robotic Assembly

1 code implementation • 4 Apr 2024 • Lars Ankile, Anthony Simeonov, Idan Shenfeld, Pulkit Agrawal

While learning from demonstrations is powerful for acquiring visuomotor policies, high-performance imitation without large demonstration datasets remains challenging for tasks requiring precise, long-horizon manipulation.

Data Augmentation Imitation Learning

Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation

no code implementations • 6 Mar 2024 • Marcel Torne, Anthony Simeonov, Zechu Li, April Chan, Tao Chen, Abhishek Gupta, Pulkit Agrawal

To learn performant, robust policies without the burden of unsafe real-world data collection or extensive human supervision, we propose RialTo, a system for robustifying real-world imitation learning policies via reinforcement learning in "digital twin" simulation environments constructed on the fly from small amounts of real-world data.

Imitation Learning reinforcement-learning

Curiosity-driven Red-teaming for Large Language Models

1 code implementation • 29 Feb 2024 • Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James Glass, Akash Srivastava, Pulkit Agrawal

To probe when an LLM generates unwanted content, the current paradigm is to recruit a red team of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs.

Reinforcement Learning (RL)

Training Neural Networks from Scratch with Parallel Low-Rank Adapters

no code implementations • 26 Feb 2024 • Minyoung Huh, Brian Cheung, Jeremy Bernstein, Phillip Isola, Pulkit Agrawal

The scalability of deep learning models is fundamentally limited by computing resources, memory, and communication.

Learning to See Physical Properties with Active Sensing Motor Policies

no code implementations • 2 Nov 2023 • Gabriel B. Margolis, Xiang Fu, Yandong Ji, Pulkit Agrawal

We show that the visual system trained with a small amount of real-world traversal data accurately predicts physical parameters.

Friction Image Classification

Autonomous Robotic Reinforcement Learning with Asynchronous Human Feedback

no code implementations • 31 Oct 2023 • Max Balsells, Marcel Torne, Zihan Wang, Samedh Desai, Pulkit Agrawal, Abhishek Gupta

We evaluate this system on a suite of robotic tasks in simulation and demonstrate its effectiveness at learning behaviors both in simulation and the real world.

reinforcement-learning Self-Supervised Learning

Neuro-Inspired Fragmentation and Recall to Overcome Catastrophic Forgetting in Curiosity

1 code implementation • 26 Oct 2023 • Jaedong Hwang, Zhang-Wei Hong, Eric Chen, Akhilan Boopathy, Pulkit Agrawal, Ila Fiete

Deep reinforcement learning methods exhibit impressive performance on a range of tasks but still struggle on hard exploration tasks in large environments with sparse rewards.

Lifelong Robot Learning with Human Assisted Language Planners

no code implementations • 25 Sep 2023 • Meenal Parakh, Alisha Fong, Anthony Simeonov, Tao Chen, Abhishek Gupta, Pulkit Agrawal

Large Language Models (LLMs) have been shown to act like planners that can decompose high-level instructions into a sequence of executable instructions.

Parallel $Q$-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

no code implementations • 24 Jul 2023 • Zechu Li, Tao Chen, Zhang-Wei Hong, Anurag Ajay, Pulkit Agrawal

This paper presents a Parallel $Q$-Learning (PQL) scheme that outperforms PPO in wall-clock time while maintaining the superior sample efficiency of off-policy learning.

Q-Learning reinforcement-learning

Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback

1 code implementation • 20 Jul 2023 • Marcel Torne, Max Balsells, Zihan Wang, Samedh Desai, Tao Chen, Pulkit Agrawal, Abhishek Gupta

This procedure can leverage noisy, asynchronous human feedback to learn policies with no hand-crafted reward design or exploration bonuses.

Decision Making reinforcement-learning +1

Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

no code implementations • 12 Jul 2023 • Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal

Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments.

Continuous Control counterfactual +1

Grid Cell-Inspired Fragmentation and Recall for Efficient Map Building

1 code implementation • 11 Jul 2023 • Jaedong Hwang, Zhang-Wei Hong, Eric Chen, Akhilan Boopathy, Pulkit Agrawal, Ila Fiete

Agents build and use a local map to predict their observations; high surprisal leads to a "fragmentation event" that truncates the local map.

Clustering Navigate +1

Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement

no code implementations • 10 Jul 2023 • Anthony Simeonov, Ankit Goyal, Lucas Manuelli, Lin Yen-Chen, Alina Sarmiento, Alberto Rodriguez, Pulkit Agrawal, Dieter Fox

We propose a system for rearranging objects in a scene to achieve a desired object-scene placing relationship, such as a book inserted in an open slot of a bookshelf.

TGRL: An Algorithm for Teacher Guided Reinforcement Learning

no code implementations • 6 Jul 2023 • Idan Shenfeld, Zhang-Wei Hong, Aviv Tamar, Pulkit Agrawal

To combine the benefits of these different forms of learning, it is common to train a policy to maximize a combination of reinforcement and teacher-student learning objectives.

counterfactual Decision Making +1

Straightening Out the Straight-Through Estimator: Overcoming Optimization Challenges in Vector Quantized Networks

no code implementations • 15 May 2023 • Minyoung Huh, Brian Cheung, Pulkit Agrawal, Phillip Isola

We identify the factors that contribute to this issue, including the codebook gradient sparsity and the asymmetric nature of the commitment loss, which leads to misaligned code-vector assignments.

Image Classification Quantization

Learning to Extrapolate: A Transductive Approach

1 code implementation • 27 Apr 2023 • Aviv Netanyahu, Abhishek Gupta, Max Simchowitz, Kaiqing Zhang, Pulkit Agrawal

Machine learning systems, especially with overparameterized deep neural networks, can generalize to novel test instances drawn from the same distribution as the training data.

Imitation Learning

DribbleBot: Dynamic Legged Manipulation in the Wild

no code implementations • 3 Apr 2023 • Yandong Ji, Gabriel B. Margolis, Pulkit Agrawal

DribbleBot (Dexterous Ball Manipulation with a Legged Robot) is a legged robotic system that can dribble a soccer ball under the same real-world conditions as humans (i.e., in-the-wild).

reinforcement-learning

TactoFind: A Tactile Only System for Object Retrieval

no code implementations • 23 Mar 2023 • Sameer Pai, Tao Chen, Megha Tippur, Edward Adelson, Abhishek Gupta, Pulkit Agrawal

We study the problem of object retrieval in scenarios where visual sensing is absent, object shapes are unknown beforehand, and objects can move freely, such as when grabbing objects out of a drawer.

Object Retrieval

Statistical Learning under Heterogeneous Distribution Shift

no code implementations • 27 Feb 2023 • Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

We show that, when the class $F$ is "simpler" than $G$ (measured, e.g., in terms of its metric entropy), our predictor is more resilient to heterogeneous covariate shifts in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$.

Aligning Robot and Human Representations

no code implementations • 3 Feb 2023 • Andreea Bobu, Andi Peng, Pulkit Agrawal, Julie Shah, Anca D. Dragan

To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior.

Imitation Learning Representation Learning

Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior

no code implementations • 6 Dec 2022 • Gabriel B. Margolis, Pulkit Agrawal

Learned locomotion policies can rapidly adapt to diverse environments similar to those experienced during training but lack a mechanism for fast tuning when they fail in an out-of-distribution test environment.

Is Conditional Generative Modeling all you need for Decision-Making?

no code implementations • 28 Nov 2022 • Anurag Ajay, Yilun Du, Abhi Gupta, Joshua Tenenbaum, Tommi Jaakkola, Pulkit Agrawal

We further demonstrate the advantages of modeling policies as conditional diffusion models by considering two other conditioning variables: constraints and skills.

Decision Making Offline RL +1

Discovering Generalizable Spatial Goal Representations via Graph-based Active Reward Learning

no code implementations • 24 Nov 2022 • Aviv Netanyahu, Tianmin Shu, Joshua Tenenbaum, Pulkit Agrawal

To address this, we propose a reward learning approach, Graph-based Equivalence Mappings (GEM), that can discover spatial goal representations that are aligned with the intended goal specification, enabling successful generalization in unseen environments.

AI Agent Imitation Learning

Visual Dexterity: In-Hand Reorientation of Novel and Complex Object Shapes

1 code implementation • 21 Nov 2022 • Tao Chen, Megha Tippur, Siyang Wu, Vikash Kumar, Edward Adelson, Pulkit Agrawal

The controller is trained using reinforcement learning in simulation and evaluated in the real world on new object shapes not used for training, including the most challenging scenario of reorienting objects held in the air by a downward-facing hand that must counteract gravity during reorientation.

Object

SE(3)-Equivariant Relational Rearrangement with Neural Descriptor Fields

1 code implementation • 17 Nov 2022 • Anthony Simeonov, Yilun Du, Lin Yen-Chen, Alberto Rodriguez, Leslie Pack Kaelbling, Tomas Lozano-Perez, Pulkit Agrawal

This formalism is implemented in three steps: assigning a consistent local coordinate frame to the task-relevant object parts, determining the location and orientation of this coordinate frame on unseen object instances, and executing an action that brings these frames into the desired alignment.

Object

Redeeming Intrinsic Rewards via Constrained Optimization

1 code implementation • 14 Nov 2022 • Eric Chen, Zhang-Wei Hong, Joni Pajarinen, Pulkit Agrawal

However, on easy exploration tasks, the agent gets distracted by intrinsic rewards and performs unnecessary exploration even when sufficient task (also called extrinsic) reward is available.

Montezuma's Revenge Reinforcement Learning (RL)

Distributionally Adaptive Meta Reinforcement Learning

no code implementations • 6 Oct 2022 • Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.

Meta Reinforcement Learning reinforcement-learning +1

Stable Object Reorientation using Contact Plane Registration

no code implementations • 18 Aug 2022 • Richard Li, Carlos Esteves, Ameesh Makadia, Pulkit Agrawal

We present a system for accurately predicting stable orientations for diverse rigid objects.

Object

Offline RL Policies Should be Trained to be Adaptive

no code implementations • 5 Jul 2022 • Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown.

Offline RL

Visual Pre-training for Navigation: What Can We Learn from Noise?

1 code implementation • 30 Jun 2022 • Yanwei Wang, Ching-Yun Ko, Pulkit Agrawal

One powerful paradigm in visual navigation is to predict actions from observations directly.

Inductive Bias Navigate +1

Overcoming the Spectral Bias of Neural Value Approximation

no code implementations • ICLR 2022 • Ge Yang, Anurag Ajay, Pulkit Agrawal

Value approximation using deep neural networks is at the heart of off-policy deep reinforcement learning, and is often the primary module that provides learning signals to the rest of the algorithm.

Continuous Control regression +2

Rapid Locomotion via Reinforcement Learning

no code implementations • 5 May 2022 • Gabriel B. Margolis, Ge Yang, Kartik Paigwar, Tao Chen, Pulkit Agrawal

Agile maneuvers such as sprinting and high-speed turning in the wild are challenging for legged robots.

reinforcement-learning Reinforcement Learning (RL)

Bilinear value networks

1 code implementation • 28 Apr 2022 • Zhang-Wei Hong, Ge Yang, Pulkit Agrawal

The dominant framework for off-policy multi-goal reinforcement learning involves estimating a goal-conditioned Q-value function.

Multi-Goal Reinforcement Learning

Topological Experience Replay

1 code implementation • ICLR 2022 • Zhang-Wei Hong, Tao Chen, Yen-Chen Lin, Joni Pajarinen, Pulkit Agrawal

State-of-the-art deep Q-learning methods update Q-values using state transition tuples sampled from the experience replay buffer.

Q-Learning

Stubborn: A Strong Baseline for Indoor Object Navigation

no code implementations • 14 Mar 2022 • Haokuan Luo, Albert Yue, Zhang-Wei Hong, Pulkit Agrawal

We present a strong baseline that surpasses the performance of previously published methods on the Habitat Challenge task of navigating to a target object in indoor environments.

Object

Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation

1 code implementation • 9 Dec 2021 • Anthony Simeonov, Yilun Du, Andrea Tagliasacchi, Joshua B. Tenenbaum, Alberto Rodriguez, Pulkit Agrawal, Vincent Sitzmann

Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors.

Object

A System for General In-Hand Object Re-Orientation

1 code implementation • 4 Nov 2021 • Tao Chen, Jie Xu, Pulkit Agrawal

The videos of the learned policies are available at: https://taochenshh.github.io/projects/in-hand-reorientation.

Object

Equivariant Contrastive Learning

2 code implementations • 28 Oct 2021 • Rumen Dangovski, Li Jing, Charlotte Loh, Seungwook Han, Akash Srivastava, Brian Cheung, Pulkit Agrawal, Marin Soljačić

State-of-the-art self-supervised learning (SSL) pre-training produces semantically good representations by encouraging them to be invariant under meaningful transformations prescribed from human knowledge.

Contrastive Learning Self-Supervised Learning

Equivariant Self-Supervised Learning: Encouraging Equivariance in Representations

no code implementations • ICLR 2022 • Rumen Dangovski, Li Jing, Charlotte Loh, Seungwook Han, Akash Srivastava, Brian Cheung, Pulkit Agrawal, Marin Soljacic

State-of-the-art self-supervised learning (SSL) pre-training produces semantically good representations by encouraging them to be invariant under meaningful transformations prescribed from human knowledge.

Self-Supervised Learning

Understanding the Generalization Gap in Visual Reinforcement Learning

no code implementations • 29 Sep 2021 • Anurag Ajay, Ge Yang, Ofir Nachum, Pulkit Agrawal

Deep Reinforcement Learning (RL) agents have achieved superhuman performance on several video game suites.

Data Augmentation reinforcement-learning +1

An End-to-End Differentiable Framework for Contact-Aware Robot Design

1 code implementation • 15 Jul 2021 • Jie Xu, Tao Chen, Lara Zlokapa, Michael Foshey, Wojciech Matusik, Shinjiro Sueda, Pulkit Agrawal

Existing methods for co-optimization are limited and fail to explore a rich space of designs.

Learning Task Informed Abstractions

1 code implementation • 29 Jun 2021 • Xiang Fu, Ge Yang, Pulkit Agrawal, Tommi Jaakkola

Current model-based reinforcement learning methods struggle when operating from complex visual scenes due to their inability to prioritize task-relevant features.

Model-based Reinforcement Learning reinforcement-learning +1

Residual Model Learning for Microrobot Control

no code implementations • 1 Apr 2021 • Joshua Gruenstein, Tao Chen, Neel Doshi, Pulkit Agrawal

RML provides a general framework for learning from extremely small amounts of interaction data, and our experiments with HAMR clearly demonstrate that RML substantially outperforms existing techniques.

The Low-Rank Simplicity Bias in Deep Networks

1 code implementation • 18 Mar 2021 • Minyoung Huh, Hossein Mobahi, Richard Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola

We show empirically that our claim holds true for finite-width linear and non-linear models in practical learning paradigms and show that, on natural data, these are often the solutions that generalize well.

Image Classification

Learning to Recover from Failures using Memory

no code implementations • 1 Jan 2021 • Tao Chen, Pulkit Agrawal

Learning from past mistakes is a quintessential aspect of intelligence.

Decision Making Meta-Learning

A Long Horizon Planning Framework for Manipulating Rigid Pointcloud Objects

no code implementations • 16 Nov 2020 • Anthony Simeonov, Yilun Du, Beomjoon Kim, Francois R. Hogan, Joshua Tenenbaum, Pulkit Agrawal, Alberto Rodriguez

We present a framework for solving long-horizon planning problems involving manipulation of rigid objects that operates directly from a point-cloud observation, i.e., without prior object models.

Graph Attention Motion Planning +2

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

no code implementations • ICLR 2021 • Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

Reinforcement learning (RL) has achieved impressive performance in a variety of online settings in which an agent's ability to query the environment for transitions and rewards is effectively unlimited.

Few-Shot Imitation Learning Imitation Learning +3

AdaScale SGD: A User-Friendly Algorithm for Distributed Training

1 code implementation • ICML 2020 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using large-batch training to speed up stochastic gradient descent, learning rates must adapt to new batch sizes in order to maximize speed-ups and preserve model quality.

Image Classification Machine Translation +5

Exploring Exploration: Comparing Children with RL Agents in Unified Environments

1 code implementation • 6 May 2020 • Eliza Kosoy, Jasmine Collins, David M. Chan, Sandy Huang, Deepak Pathak, Pulkit Agrawal, John Canny, Alison Gopnik, Jessica B. Hamrick

Research in developmental psychology consistently shows that children explore the world thoroughly and efficiently and that this exploration allows them to learn.

Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning

1 code implementation • 23 Dec 2019 • Richard Li, Allan Jabri, Trevor Darrell, Pulkit Agrawal

Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements.

Object reinforcement-learning +2

AdaScale SGD: A Scale-Invariant Algorithm for Distributed Training

no code implementations • 25 Sep 2019 • Tyler B. Johnson, Pulkit Agrawal, Haijie Gu, Carlos Guestrin

When using distributed training to speed up stochastic gradient descent, learning rates must adapt to new scales in order to maintain training effectiveness.

Image Classification Machine Translation +5

Classification in the dark using tactile exploration

no code implementations • ICLR 2019 • Mayur Mudigonda, Blake Tickell, Pulkit Agrawal

Combining information from different sensory modalities to execute goal-directed actions is a key aspect of human intelligence.

Classification General Classification +2

Superposition of many models into one

1 code implementation • NeurIPS 2019 • Brian Cheung, Alex Terekhov, Yubei Chen, Pulkit Agrawal, Bruno Olshausen

We present a method for storing multiple models within a single set of parameters.

Learning Instance Segmentation by Interaction

1 code implementation • 21 Jun 2018 • Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik

The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.

Instance Segmentation Segmentation +1

What Will Happen Next? Forecasting Player Moves in Sports Videos

no code implementations • ICCV 2017 • Panna Felsen, Pulkit Agrawal, Jitendra Malik

A large number of very popular team sports involve the act of one team trying to score a goal against the other.

Learning to Perform Physics Experiments via Deep Reinforcement Learning

no code implementations • 6 Nov 2016 • Misha Denil, Pulkit Agrawal, Tejas D. Kulkarni, Tom Erez, Peter Battaglia, Nando de Freitas

When encountering novel objects, humans are able to infer a wide range of physical properties such as mass, friction, and deformability by interacting with them in a goal-driven way.

Friction reinforcement-learning +1

Learning Visual Predictive Models of Physics for Playing Billiards

no code implementations • 23 Nov 2015 • Katerina Fragkiadaki, Pulkit Agrawal, Sergey Levine, Jitendra Malik

The ability to plan and execute goal-specific actions in varied, unexpected settings is a central requirement of intelligent agents.

Human Pose Estimation with Iterative Error Feedback

1 code implementation • CVPR 2016 • Joao Carreira, Pulkit Agrawal, Katerina Fragkiadaki, Jitendra Malik

Hierarchical feature extractors such as Convolutional Networks (ConvNets) have achieved impressive performance on a variety of classification tasks using purely feedforward processing.

Pose Estimation Semantic Segmentation

Learning to See by Moving

no code implementations • ICCV 2015 • Pulkit Agrawal, Joao Carreira, Jitendra Malik

We show that given the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt using class-label as supervision on visual tasks of scene recognition, object recognition, visual odometry and keypoint matching.

Object Recognition Scene Recognition +1

Pixels to Voxels: Modeling Visual Representation in the Human Brain

no code implementations • 18 Jul 2014 • Pulkit Agrawal, Dustin Stansbury, Jitendra Malik, Jack L. Gallant

We find that both classes of models accurately predict brain activity in high-level visual areas, directly from pixels and without the need for any semantic tags or hand annotation of images.

BIG-bench Machine Learning Object Recognition

Analyzing the Performance of Multilayer Neural Networks for Object Recognition

no code implementations • 7 Jul 2014 • Pulkit Agrawal, Ross Girshick, Jitendra Malik

In the last two years, convolutional neural networks (CNNs) have achieved an impressive suite of results on standard recognition datasets and tasks.

Object Recognition
