Search Results for author: Pieter Abbeel

Found 348 papers, 188 papers with code

Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings

no code implementations • ICML 2018 • John D. Co-Reyes, Yuxuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine

We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems.

Hierarchical Reinforcement Learning reinforcement-learning +2

Paper
Add Code

Learning Generalized Reactive Policies using Deep Neural Networks

no code implementations • 24 Aug 2017 • Edward Groshev, Maxwell Goldstein, Aviv Tamar, Siddharth Srivastava, Pieter Abbeel

We show that a deep neural network can be used to learn and represent a \emph{generalized reactive policy} (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances.

Decision Making feature selection

Paper
Add Code

Latent Space Policies for Hierarchical Reinforcement Learning

no code implementations • ICML 2018 • Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.

Hierarchical Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Domain Randomization and Generative Models for Robotic Grasping

no code implementations • 17 Oct 2017 • Joshua Tobin, Lukas Biewald, Rocky Duan, Marcin Andrychowicz, Ankur Handa, Vikash Kumar, Bob McGrew, Jonas Schneider, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

In this work, we explore a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis.

Object Robotic Grasping

Paper
Add Code

Learning Robotic Assembly from CAD

no code implementations • 20 Mar 2018 • Garrett Thomas, Melissa Chien, Aviv Tamar, Juan Aparicio Ojea, Pieter Abbeel

We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data.

Motion Planning Reinforcement Learning (RL)

Paper
Add Code

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

no code implementations • ICLR 2018 • Cathy Wu, Aravind Rajeswaran, Yan Duan, Vikash Kumar, Alexandre M. Bayen, Sham Kakade, Igor Mordatch, Pieter Abbeel

To mitigate this issue, we derive a bias-free action-dependent baseline for variance reduction which fully exploits the structural form of the stochastic policy itself and does not make any additional assumptions about the MDP.

Policy Gradient Methods reinforcement-learning +1

Paper
Add Code

Interpretable and Pedagogical Examples

no code implementations • ICLR 2018 • Smitha Milli, Pieter Abbeel, Igor Mordatch

Teachers intentionally pick the most informative examples to show their students.

Paper
Add Code

A Berkeley View of Systems Challenges for AI

no code implementations • 15 Dec 2017 • Ion Stoica, Dawn Song, Raluca Ada Popa, David Patterson, Michael W. Mahoney, Randy Katz, Anthony D. Joseph, Michael Jordan, Joseph M. Hellerstein, Joseph E. Gonzalez, Ken Goldberg, Ali Ghodsi, David Culler, Pieter Abbeel

With the increasing commoditization of computer vision, speech recognition and machine translation systems and the widespread deployment of learning-based back-end technologies such as digital advertising and intelligent infrastructures, AI (Artificial Intelligence) has moved from research labs to production.

Machine Translation speech-recognition +1

Paper
Add Code

One-Shot Imitation Learning

no code implementations • NeurIPS 2017 • Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba

A neural net is trained that takes as input one demonstration and the current state (which initially is the initial state of the other demonstration of the pair), and outputs an action with the goal that the resulting sequence of states and actions matches as closely as possible with the second demonstration.

Feature Engineering Imitation Learning +1

Paper
Add Code

Safer Classification by Synthesis

no code implementations • 22 Nov 2017 • William Wang, Angelina Wang, Aviv Tamar, Xi Chen, Pieter Abbeel

We posit that a generative approach is the natural remedy for this problem, and propose a method for classification using generative models.

Classification General Classification

Paper
Add Code

Equivalence Between Policy Gradients and Soft Q-Learning

no code implementations • 21 Apr 2017 • John Schulman, Xi Chen, Pieter Abbeel

A partial explanation may be that $Q$-learning methods are secretly implementing policy gradient updates: we show that there is a precise equivalence between $Q$-learning and policy gradient methods in the setting of entropy-regularized reinforcement learning, that "soft" (entropy-regularized) $Q$-learning is exactly equivalent to a policy gradient method.

Policy Gradient Methods Q-Learning +2

Paper
Add Code

Inverse Reward Design

1 code implementation • NeurIPS 2017 • Dylan Hadfield-Menell, Smitha Milli, Pieter Abbeel, Stuart Russell, Anca Dragan

When designing the reward, we might think of some specific training scenarios, and make sure that the reward will lead to the right behavior in those scenarios.

Paper
Code

UCB Exploration via Q-Ensembles

no code implementations • ICLR 2018 • Richard Y. Chen, Szymon Sidor, Pieter Abbeel, John Schulman

We show how an ensemble of $Q^*$-functions can be leveraged for more effective exploration in deep reinforcement learning.

Q-Learning reinforcement-learning +1

Paper
Add Code

Asymmetric Actor Critic for Image-Based Robot Learning

no code implementations • 18 Oct 2017 • Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator.

Decision Making Reinforcement Learning (RL)

Paper
Add Code

Reverse Curriculum Generation for Reinforcement Learning

no code implementations • 17 Jul 2017 • Carlos Florensa, David Held, Markus Wulfmeier, Michael Zhang, Pieter Abbeel

The robot is trained in reverse, gradually learning to reach the goal from a set of start states increasingly far from the goal.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Mutual Alignment Transfer Learning

no code implementations • 25 Jul 2017 • Markus Wulfmeier, Ingmar Posner, Pieter Abbeel

Training robots for operation in the real world is a complex, time consuming and potentially expensive task.

Transfer Learning

Paper
Add Code

Prediction and Control with Temporal Segment Models

no code implementations • ICML 2017 • Nikhil Mishra, Pieter Abbeel, Igor Mordatch

We introduce a method for learning the dynamics of complex nonlinear systems based on deep generative models over temporal segments of states and actions.

Paper
Add Code

The Off-Switch Game

no code implementations • 24 Nov 2016 • Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell

We analyze a simple game between a human H and a robot R, where H can press R's off switch but R can disable the off switch.

Paper
Add Code

Adapting Deep Visuomotor Representations with Weak Pairwise Constraints

no code implementations • 23 Nov 2015 • Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell

We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains.

Domain Adaptation

Paper
Add Code

Probabilistically Safe Policy Transfer

no code implementations • 15 May 2017 • David Held, Zoe McCarthy, Michael Zhang, Fred Shentu, Pieter Abbeel

Although learning-based methods have great potential for robotics, one concern is that a robot that updates its parameters might cause large amounts of damage before it learns the optimal policy.

Paper
Add Code

Inverse Reinforcement Learning via Deep Gaussian Process

no code implementations • 26 Dec 2015 • Ming Jin, Andreas Damianou, Pieter Abbeel, Costas Spanos

We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstration

no code implementations • 21 Mar 2016 • Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel

In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.

Paper
Add Code

Generalizing Skills with Semi-Supervised Reinforcement Learning

no code implementations • 1 Dec 2016 • Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine

We evaluate our method on challenging tasks that require control directly from images, and show that our approach can improve the generalization of a learned deep neural network policy by using experience for which no reward function is available.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning

no code implementations • 8 Mar 2017 • Abhishek Gupta, Coline Devin, Yuxuan Liu, Pieter Abbeel, Sergey Levine

People can learn a wide range of tasks from their own experience, but can also learn from observing other creatures.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Deep Reinforcement Learning for Tensegrity Robot Locomotion

no code implementations • 28 Sep 2016 • Marvin Zhang, Xinyang Geng, Jonathan Bruce, Ken Caluwaerts, Massimo Vespignani, Vytas SunSpiral, Pieter Abbeel, Sergey Levine

We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Combining Self-Supervised Learning and Imitation for Vision-Based Rope Manipulation

no code implementations • 6 Mar 2017 • Ashvin Nair, Dian Chen, Pulkit Agrawal, Phillip Isola, Pieter Abbeel, Jitendra Malik, Sergey Levine

Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics.

Self-Supervised Learning

Paper
Add Code

Variational Lossy Autoencoder

no code implementations • 8 Nov 2016 • Xi Chen, Diederik P. Kingma, Tim Salimans, Yan Duan, Prafulla Dhariwal, John Schulman, Ilya Sutskever, Pieter Abbeel

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification.

Density Estimation Image Generation +1

Paper
Add Code

PLATO: Policy Learning using Adaptive Trajectory Optimization

no code implementations • 2 Mar 2016 • Gregory Kahn, Tianhao Zhang, Sergey Levine, Pieter Abbeel

PLATO also maintains the MPC cost as an objective to avoid highly undesirable actions that would result from strictly following the learned policy before it has been fully trained.

Model Predictive Control

Paper
Add Code

Enabling Robots to Communicate their Objectives

no code implementations • 11 Feb 2017 • Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan

We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations.

Autonomous Driving

Paper
Add Code

Uncertainty-Aware Reinforcement Learning for Collision Avoidance

no code implementations • 3 Feb 2017 • Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine

However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot.

Collision Avoidance Navigate +2

Paper
Add Code

A K-fold Method for Baseline Estimation in Policy Gradient Algorithms

no code implementations • 3 Jan 2017 • Nithyanand Kota, Abhishek Mishra, Sunil Srinivasa, Xi, Chen, Pieter Abbeel

The high variance issue in unbiased policy-gradient methods such as VPG and REINFORCE is typically mitigated by adding a baseline.

Policy Gradient Methods

Paper
Add Code

Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

no code implementations • 11 Oct 2016 • Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba

Nevertheless, often the overall gist of what the policy does in simulation remains valid in the real world.

Friction

Paper
Add Code

Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States

no code implementations • 4 Oct 2016 • William Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine

Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Combinatorial Energy Learning for Image Segmentation

no code implementations • NeurIPS 2016 • Jeremy Maitin-Shepard, Viren Jain, Michal Januszewski, Peter Li, Pieter Abbeel

We introduce a new machine learning approach for image segmentation that uses a neural network to model the conditional energy of a segmentation given an image.

Image Segmentation Segmentation +1

Paper
Add Code

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

no code implementations • 22 Sep 2016 • Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Add Code

One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors

no code implementations • 23 Sep 2015 • Justin Fu, Sergey Levine, Pieter Abbeel

One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand.

Model-based Reinforcement Learning Model Predictive Control +3

Paper
Add Code

End-to-End Training of Deep Visuomotor Policies

no code implementations • 2 Apr 2015 • Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel

Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control.

Paper
Add Code

Model-based Reinforcement Learning with Parametrized Physical Models and Optimism-Driven Exploration

no code implementations • 23 Sep 2015 • Christopher Xie, Sachin Patil, Teodor Moldovan, Sergey Levine, Pieter Abbeel

In this paper, we present a robotic model-based reinforcement learning method that combines ideas from model identification and model predictive control.

Model-based Reinforcement Learning Model Predictive Control +2

Paper
Add Code

Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search

no code implementations • 22 Sep 2015 • Tianhao Zhang, Gregory Kahn, Sergey Levine, Pieter Abbeel

We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment.

Model Predictive Control reinforcement-learning +1

Paper
Add Code

Learning Deep Neural Network Policies with Continuous Memory States

no code implementations • 5 Jul 2015 • Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel

We evaluate our method on tasks involving continuous control in manipulation and navigation settings, and show that our method can learn complex policies that successfully complete a range of tasks that require memory.

Continuous Control Memorization

Paper
Add Code

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

no code implementations • 18 Oct 2017 • Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.

Robotics Systems and Control

Paper
Add Code

Variational Option Discovery Algorithms

no code implementations • 26 Jul 2018 • Joshua Achiam, Harrison Edwards, Dario Amodei, Pieter Abbeel

We explore methods for option discovery based on variational inference and make two algorithmic contributions.

Decoder Variational Inference

Paper
Add Code

Transfer Learning for Estimating Causal Effects using Neural Networks

no code implementations • 23 Aug 2018 • Sören R. Künzel, Bradly C. Stadie, Nikita Vemuri, Varsha Ramakrishnan, Jasjeet S. Sekhon, Pieter Abbeel

We develop new algorithms for estimating heterogeneous treatment effects, combining recent developments in transfer learning for neural networks with insights from the causal inference literature.

Causal Inference Transfer Learning

Paper
Add Code

One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks

no code implementations • 25 Oct 2018 • Tianhe Yu, Pieter Abbeel, Sergey Levine, Chelsea Finn

We consider the problem of learning multi-stage vision-based tasks on a real robot from a single video of a human performing the task, while leveraging demonstration data of subtasks with other objects.

Imitation Learning

Paper
Add Code

Modular Architecture for StarCraft II with Deep Reinforcement Learning

no code implementations • 8 Nov 2018 • Dennis Lee, Haoran Tang, Jeffrey O. Zhang, Huazhe Xu, Trevor Darrell, Pieter Abbeel

We present a novel modular architecture for StarCraft II AI.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Add Code

An Algorithmic Perspective on Imitation Learning

no code implementations • 16 Nov 2018 • Takayuki Osa, Joni Pajarinen, Gerhard Neumann, J. Andrew Bagnell, Pieter Abbeel, Jan Peters

This process of learning from demonstrations, and the study of algorithms to do so, is called imitation learning.

Imitation Learning Learning Theory

Paper
Add Code

The Importance of Sampling inMeta-Reinforcement Learning

no code implementations • NeurIPS 2018 • Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever

Results are presented on a new environment we call `Krazy World': a difficult high-dimensional gridworld which is designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning.

Meta Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics

no code implementations • NeurIPS 2014 • Sergey Levine, Pieter Abbeel

We present a policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems.

Paper
Add Code

On a Connection between Importance Sampling and the Likelihood Ratio Policy Gradient

no code implementations • NeurIPS 2010 • Tang Jie, Pieter Abbeel

Likelihood ratio policy gradient methods have been some of the most successful reinforcement learning algorithms, especially for learning on physical systems.

Policy Gradient Methods

Paper
Add Code

Learning to Reinforcement Learn by Imitation

no code implementations • ICLR 2019 • Rosen Kralev, Russell Mendonca, Alvin Zhang, Tianhe Yu, Abhishek Gupta, Pieter Abbeel, Sergey Levine, Chelsea Finn

Meta-reinforcement learning aims to learn fast reinforcement learning (RL) procedures that can be applied to new tasks or environments.

Meta-Learning Meta Reinforcement Learning +2

Paper
Add Code

Inferring Reward Functions from Demonstrators with Unknown Biases

no code implementations • ICLR 2019 • Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca Dragan

Our goal is to infer reward functions from demonstrations.

Paper
Add Code

Self-Supervised Learning of Object Motion Through Adversarial Video Prediction

no code implementations • ICLR 2018 • Alex X. Lee, Frederik Ebert, Richard Zhang, Chelsea Finn, Pieter Abbeel, Sergey Levine

In this paper, we study the problem of multi-step video prediction, where the goal is to predict a sequence of future frames conditioned on a short context.

Object Self-Supervised Learning +1

Paper
Add Code

Domain Randomization for Active Pose Estimation

no code implementations • 10 Mar 2019 • Xinyi Ren, Jianlan Luo, Eugen Solowjow, Juan Aparicio Ojea, Abhishek Gupta, Aviv Tamar, Pieter Abbeel

In this work, we investigate how to improve the accuracy of domain randomization based pose estimation.

Pose Estimation

Paper
Add Code

Towards Characterizing Divergence in Deep Q-Learning

no code implementations • 21 Mar 2019 • Joshua Achiam, Ethan Knight, Pieter Abbeel

Deep Q-Learning (DQL), a family of temporal difference algorithms for control, employs three techniques collectively known as the `deadly triad' in reinforcement learning: bootstrapping, off-policy learning, and function approximation.

Continuous Control OpenAI Gym +1

Paper
Add Code

Guided Meta-Policy Search

no code implementations • NeurIPS 2019 • Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn

Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.

Continuous Control Imitation Learning +4

Paper
Add Code

Learning Robotic Manipulation through Visual Planning and Acting

no code implementations • 11 May 2019 • Angelina Wang, Thanard Kurutach, Kara Liu, Pieter Abbeel, Aviv Tamar

We further demonstrate our approach on learning to imagine and execute in 3 environments, the final of which is deformable rope manipulation on a PR2 robot.

Visual Tracking

Paper
Add Code

Learning latent state representation for speeding up exploration

no code implementations • 27 May 2019 • Giulia Vezzani, Abhishek Gupta, Lorenzo Natale, Pieter Abbeel

In this work, we take a representation learning viewpoint on exploration, utilizing prior experience to learn effective latent representations, which can subsequently indicate which regions to explore.

Representation Learning

Paper
Add Code

Sub-policy Adaptation for Hierarchical Reinforcement Learning

no code implementations • ICLR 2020 • Alexander C. Li, Carlos Florensa, Ignasi Clavera, Pieter Abbeel

Hierarchical reinforcement learning is a promising approach to tackle long-horizon decision-making problems with sparse rewards.

Decision Making Hierarchical Reinforcement Learning +2

Paper
Add Code

On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference

no code implementations • 23 Jun 2019 • Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca D. Dragan

But in the era of deep learning, a natural suggestion researchers make is to avoid mathematical models of human behavior that are fraught with specific assumptions, and instead use a purely data-driven approach.

Paper
Add Code

BagNet: Berkeley Analog Generator with Layout Optimizer Boosted with Deep Neural Networks

no code implementations • 23 Jul 2019 • Kourosh Hakhamaneshi, Nick Werblun, Pieter Abbeel, Vladimir Stojanovic

The discrepancy between post-layout and schematic simulation results continues to widen in analog design due in part to the domination of layout parasitics.

Paper
Add Code

Likelihood Contribution based Multi-scale Architecture for Generative Flows

no code implementations • 5 Aug 2019 • Hari Prasanna Das, Pieter Abbeel, Costas J. Spanos

Deep generative modeling using flows has gained popularity owing to the tractable exact log-likelihood estimation with efficient training and synthesis process.

Dimensionality Reduction

Paper
Add Code

Learning Contact-Rich Manipulation Skills with Guided Policy Search

no code implementations • 22 Jan 2015 • Sergey Levine, Nolan Wagener, Pieter Abbeel

Autonomous learning of object manipulation skills can enable robots to acquire rich behavioral repertoires that scale to the variety of objects found in the real world.

Robotics

Paper
Add Code

Plan Arithmetic: Compositional Plan Vectors for Multi-Task Control

no code implementations • 30 Oct 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

We show that CPVs can be learned within a one-shot imitation learning framework without any additional supervision or information about task hierarchy, and enable a demonstration-conditioned policy to generalize to tasks that sequence twice as many skills as the tasks seen during training.

Imitation Learning

Paper
Add Code

Learning Efficient Representation for Intrinsic Motivation

no code implementations • 4 Dec 2019 • Ruihan Zhao, Stas Tiomkin, Pieter Abbeel

The core idea is to represent the relation between action sequences and future states using a stochastic dynamic model in latent space with a specific form.

Paper
Add Code

AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos

no code implementations • 10 Dec 2019 • Laura Smith, Nikita Dhawan, Marvin Zhang, Pieter Abbeel, Sergey Levine

In this paper, we study how these challenges can be alleviated with an automated robotic learning framework, in which multi-stage tasks are defined simply by providing videos of a human demonstrator and then learned autonomously by the robot from raw image observations.

Reinforcement Learning (RL) Translation

Paper
Add Code

Natural Image Manipulation for Autoregressive Models Using Fisher Scores

no code implementations • 25 Nov 2019 • Wilson Yan, Jonathan Ho, Pieter Abbeel

Deep autoregressive models are one of the most powerful models that exist today which achieve state-of-the-art bits per dim.

Image Manipulation

Paper
Add Code

Establishing Appropriate Trust via Critical States

no code implementations • 18 Oct 2018 • Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan

In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts.

Robotics

Paper
Add Code

Hierarchical Variational Imitation Learning of Control Programs

1 code implementation • 29 Dec 2019 • Roy Fox, Richard Shin, William Paul, Yitian Zou, Dawn Song, Ken Goldberg, Pieter Abbeel, Ion Stoica

Autonomous agents can learn by imitating teacher demonstrations of the intended behavior.

Imitation Learning Variational Inference

Paper
Code

Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards

no code implementations • 21 Dec 2019 • Xingyu Lu, Stas Tiomkin, Pieter Abbeel

While recent progress in deep reinforcement learning has enabled robots to learn complex behaviors, tasks with long horizons and sparse rewards remain an ongoing challenge.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Preventing Imitation Learning with Adversarial Policy Ensembles

no code implementations • 31 Jan 2020 • Albert Zhan, Stas Tiomkin, Pieter Abbeel

To our knowledge, this is the first work regarding the protection of policies in Reinforcement Learning.

Imitation Learning reinforcement-learning +1

Paper
Add Code

Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning

no code implementations • 5 Feb 2020 • Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu

In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction

no code implementations • 17 Feb 2020 • Kourosh Hakhamaneshi, Keertana Settaluri, Pieter Abbeel, Vladimir Stojanovic

In this work we present a new method of black-box optimization and constraint satisfaction.

Policy Gradient Methods

Paper
Add Code

The Limits and Potentials of Deep Learning for Robotics

no code implementations • 18 Apr 2018 • Niko Sünderhauf, Oliver Brock, Walter Scheirer, Raia Hadsell, Dieter Fox, Jürgen Leitner, Ben Upcroft, Pieter Abbeel, Wolfram Burgard, Michael Milford, Peter Corke

In this paper we discuss a number of robotics-specific learning, reasoning, and embodiment challenges for deep learning.

Robotics

Paper
Add Code

Generalized Hindsight for Reinforcement Learning

no code implementations • NeurIPS 2020 • Alexander C. Li, Lerrel Pinto, Pieter Abbeel

Compared to standard relabeling techniques, Generalized Hindsight provides a substantially more efficient reuse of samples, which we empirically demonstrate on a suite of multi-task navigation and manipulation tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Mutual Information Maximization for Robust Plannable Representations

no code implementations • 16 May 2020 • Yiming Ding, Ignasi Clavera, Pieter Abbeel

The later, while they present low sample complexity, they learn latent spaces that need to reconstruct every single detail of the scene.

Model-based Reinforcement Learning

Paper
Add Code

Model-Augmented Actor-Critic: Backpropagating through Paths

no code implementations • ICLR 2020 • Ignasi Clavera, Violet Fu, Pieter Abbeel

Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator to augment the data for policy optimization or value function learning.

Model-based Reinforcement Learning

Paper
Add Code

Responsive Safety in Reinforcement Learning by PID Lagrangian Methods

no code implementations • 8 Jul 2020 • Adam Stooke, Joshua Achiam, Pieter Abbeel

Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when applied to safe reinforcement learning, leads to constraint-violating behavior during agent training.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Efficient Empowerment Estimation for Unsupervised Stabilization

no code implementations • ICLR 2021 • Ruihan Zhao, Kevin Lu, Pieter Abbeel, Stas Tiomkin

We demonstrate our solution for sample-based unsupervised stabilization on different dynamical control systems and show the advantages of our method by comparing it to the existing VLB approaches.

Paper
Add Code

Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning

no code implementations • 3 Aug 2020 • Xingyu Lu, Kimin Lee, Pieter Abbeel, Stas Tiomkin

Despite the significant progress of deep reinforcement learning (RL) in solving sequential decision making problems, RL agents often overfit to training environments and struggle to adapt to new, unseen environments.

Decision Making reinforcement-learning +1

Paper
Add Code

Visual Imitation Made Easy

no code implementations • 11 Aug 2020 • Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto

We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.

Imitation Learning

Paper
Add Code

Hierarchically Decoupled Morphological Transfer

no code implementations • ICML 2020 • Donald Hejna, Lerrel Pinto, Pieter Abbeel

Learning long-range behaviors on complex high-dimensional agents is a fundamental problem in robot learning.

Paper
Add Code

Responsive Safety in Reinforcement Learning

no code implementations • ICML 2020 • Adam Stooke, Joshua Achiam, Pieter Abbeel

This intuition leads to our introduction of PID control for the Lagrange multiplier in constrained RL, which we cast as a dynamical system.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Planning to Explore via Latent Disagreement

no code implementations • ICML 2020 • Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak

To solve complex tasks, intelligent agents first need to explore their environments.

Paper
Add Code

VideoGen: Generative Modeling of Videos using VQ-VAE and Transformers

no code implementations • 1 Jan 2021 • Yunzhi Zhang, Wilson Yan, Pieter Abbeel, Aravind Srinivas

We present VideoGen: a conceptually simple architecture for scaling likelihood based generative modeling to natural videos.

Position Video Generation

Paper
Add Code

Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets

no code implementations • 1 Jan 2021 • SeungHyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin

As it turns out, fine-tuning offline RL agents is a non-trivial challenge, due to distribution shift – the agent encounters out-of-distribution samples during online interaction, which may cause bootstrapping error in Q-learning and instability during fine-tuning.

D4RL Offline RL +3

Paper
Add Code

Unsupervised Active Pre-Training for Reinforcement Learning

no code implementations • 1 Jan 2021 • Hao liu, Pieter Abbeel

On DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency and dramatically improves performance on tasks that are extremely difficult for training from scratch.

Atari Games Contrastive Learning +3

Paper
Add Code

Robust Imitation via Decision-Time Planning

no code implementations • 1 Jan 2021 • Carl Qi, Pieter Abbeel, Aditya Grover

The goal of imitation learning is to mimic expert behavior from demonstrations, without access to an explicit reward signal.

Imitation Learning reinforcement-learning +2

Paper
Add Code

R-LAtte: Attention Module for Visual Control via Reinforcement Learning

no code implementations • 1 Jan 2021 • Mandi Zhao, Qiyang Li, Aravind Srinivas, Ignasi Clavera, Kimin Lee, Pieter Abbeel

Attention mechanisms are generic inductive biases that have played a critical role in improving the state-of-the-art in supervised learning, unsupervised pre-training and generative modeling for multiple domains including vision, language and speech.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates

no code implementations • 1 Jan 2021 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel

Furthermore, since our weighted Bellman backups rely on maintaining an ensemble, we investigate how weighted Bellman backups interact with other benefits previously derived from ensembles: (a) Bootstrap; (b) UCB Exploration.

Q-Learning Reinforcement Learning (RL)

Paper
Add Code

Benefits of Assistance over Reward Learning

no code implementations • 1 Jan 2021 • Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell

By merging reward learning and control, assistive agents can reason about the impact of control actions on reward learning, leading to several advantages over agents based on reward learning.

Paper
Add Code

Compute- and Memory-Efficient Reinforcement Learning with Latent Experience Replay

no code implementations • 1 Jan 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel

In this paper, we present Latent Vector Experience Replay (LeVER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements without sacrificing the performance of RL agents.

Atari Games reinforcement-learning +2

Paper
Add Code

Discrete Predictive Representation for Long-horizon Planning

no code implementations • 1 Jan 2021 • Thanard Kurutach, Julia Peng, Yang Gao, Stuart Russell, Pieter Abbeel

Discrete representations have been key in enabling robots to plan at more abstract levels and solve temporally-extended tasks more efficiently for decades.

Object Reinforcement Learning (RL)

Paper
Add Code

Learning Visual Robotic Control Efficiently with Contrastive Pre-training and Data Augmentation

no code implementations • 14 Dec 2020 • Albert Zhan, Ruihan Zhao, Lerrel Pinto, Pieter Abbeel, Michael Laskin

We present Contrastive Pre-training and Data Augmentation for Efficient Robotic Learning (CoDER), a method that utilizes data augmentation and unsupervised learning to achieve sample-efficient training of real-robot arm policies from sparse rewards.

Data Augmentation reinforcement-learning +2

Paper
Add Code

Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

no code implementations • 26 Mar 2021 • Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Developing robust walking controllers for bipedal robots is a challenging endeavor.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

GEM: Group Enhanced Model for Learning Dynamical Control Systems

no code implementations • 7 Apr 2021 • Philippe Hansen-Estruch, Wenling Shang, Lerrel Pinto, Pieter Abbeel, Stas Tiomkin

In this work, we take advantage of these structures to build effective dynamical models that are amenable to sample-based learning.

Continuous Control Model-based Reinforcement Learning

Paper
Add Code

Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL

no code implementations • 16 Jun 2021 • Catherine Cang, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

When combined together, they substantially improve the performance and generalization of offline RL policies.

D4RL Domain Generalization +2

Paper
Add Code

Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments

no code implementations • 18 Jun 2021 • Abdus Salam Azad, Edward Kim, Qiancheng Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Sanjit A. Seshia

To showcase the benefits, we interfaced SCENIC to an existing RTS environment Google Research Football(GRF) simulator and introduced a benchmark consisting of 32 realistic scenarios, encoded in SCENIC, to train RL agents and testing their generalization capabilities.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations • 5 Jul 2021 • Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning

Paper
Add Code

Playful Interactions for Representation Learning

no code implementations • 19 Jul 2021 • Sarah Young, Jyothish Pari, Pieter Abbeel, Lerrel Pinto

In this work, we propose to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks.

Imitation Learning Representation Learning

Paper
Add Code

Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback

no code implementations • 11 Aug 2021 • Xiaofei Wang, Kimin Lee, Kourosh Hakhamaneshi, Pieter Abbeel, Michael Laskin

A promising approach to solving challenging long-horizon tasks has been to extract behavior priors (skills) by fitting generative models to large offline datasets of demonstrations.

Paper
Add Code

APS: Active Pretraining with Successor Features

no code implementations • 31 Aug 2021 • Hao liu, Pieter Abbeel

We introduce a new unsupervised pretraining objective for reinforcement learning.

Ranked #5 on Unsupervised Reinforcement Learning on URLB (states, 5*10^5 frames)

Unsupervised Reinforcement Learning

Paper
Add Code

Improving Long-Horizon Imitation Through Language Prediction

no code implementations • 29 Sep 2021 • Donald Joseph Hejna III, Pieter Abbeel, Lerrel Pinto

Complex, long-horizon planning and its combinatorial nature pose steep challenges for learning-based agents.

Paper
Add Code

Pretraining for Language Conditioned Imitation with Transformers

no code implementations • 29 Sep 2021 • Aaron L Putterman, Kevin Lu, Igor Mordatch, Pieter Abbeel

We study reinforcement learning (RL) agents which can utilize language inputs.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Autoregressive Latent Video Prediction with High-Fidelity Image Generator

no code implementations • 29 Sep 2021 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel

Video prediction is an important yet challenging problem; burdened with the tasks of generating future frames and learning environment dynamics.

Data Augmentation Video Prediction +1

Paper
Add Code

Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers

no code implementations • 29 Sep 2021 • Catherine Cang, Kourosh Hakhamaneshi, Ryan Rudes, Igor Mordatch, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

In this paper, we investigate how we can leverage large reward-free (i. e. task-agnostic) offline datasets of prior interactions to pre-train agents that can then be fine-tuned using a small reward-annotated dataset.

D4RL Offline RL +2

Paper
Add Code

Towards More Generalizable One-shot Visual Imitation Learning

no code implementations • 26 Oct 2021 • Zhao Mandi, Fangchen Liu, Kimin Lee, Pieter Abbeel

We then study the multi-task setting, where multi-task training is followed by (i) one-shot imitation on variations within the training tasks, (ii) one-shot imitation on new tasks, and (iii) fine-tuning on new tasks.

Contrastive Learning Imitation Learning +2

Paper
Add Code

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates

no code implementations • 28 Oct 2021 • Litian Liang, Yaosheng Xu, Stephen Mcaleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

Under the belief that $\beta$ is closely related to the (state dependent) model uncertainty, Entropy Regularized Q-Learning (EQL) further introduces a principled scheduling of $\beta$ by maintaining a collection of the model parameters that characterizes model uncertainty.

Q-Learning Scheduling

Paper
Add Code

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

no code implementations • 4 Nov 2021 • Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak

We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects and generalize to new objects with unseen shape or size.

Multi-Task Learning Object +2

Paper
Add Code

PatchFormer: A neural architecture for self-supervised representation learning on images

no code implementations • 25 Sep 2019 • Aravind Srinivas, Pieter Abbeel

In this paper, we propose a neural architecture for self-supervised representation learning on raw images called the PatchFormer which learns to model spatial dependencies across patches in a raw image.

Representation Learning Self-Supervised Learning

Paper
Add Code

Dynamical System Embedding for Efficient Intrinsically Motivated Artificial Agents

no code implementations • 25 Sep 2019 • Ruihan Zhao, Stas Tiomkin, Pieter Abbeel

In this work, we develop a novel approach for the estimation of empowerment in unknown arbitrary dynamics from visual stimulus only, without sampling for the estimation of MIAS.

Paper
Add Code

Data-Efficient Exploration with Self Play for Atari

no code implementations • ICML Workshop URL 2021 • Michael Laskin, Catherine Cang, Ryan Rudes, Pieter Abbeel

To alleviate the reliance on reward engineering it is important to develop RL algorithms capable of efficiently acquiring skills with no rewards extrinsic to the agent.

Efficient Exploration Reinforcement Learning (RL)

Paper
Add Code

Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning

no code implementations • 28 Nov 2021 • Dailin Hu, Pieter Abbeel, Roy Fox

Maximum Entropy Reinforcement Learning (MaxEnt RL) algorithms such as Soft Q-Learning (SQL) and Soft Actor-Critic trade off reward and policy entropy, which has the potential to improve training stability and robustness.

Q-Learning reinforcement-learning +2

Paper
Add Code

Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL

no code implementations • NeurIPS 2021 • Charles Packer, Pieter Abbeel, Joseph E. Gonzalez

Meta-reinforcement learning (meta-RL) has proven to be a successful framework for leveraging experience from prior tasks to rapidly learn new related tasks, however, current meta-RL approaches struggle to learn in sparse reward environments.

Meta Reinforcement Learning

Paper
Add Code

Target Entropy Annealing for Discrete Soft Actor-Critic

no code implementations • 6 Dec 2021 • Yaosheng Xu, Dailin Hu, Litian Liang, Stephen Mcaleer, Pieter Abbeel, Roy Fox

Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in continuous action space settings.

Atari Games Scheduling

Paper
Add Code

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

no code implementations • 22 Feb 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

Training such agents efficiently requires automatic generation of a goal curriculum.

Paper
Add Code

SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning

no code implementations • ICLR 2022 • Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.

Data Augmentation Reinforcement Learning (RL)

Paper
Add Code

Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions

no code implementations • 28 Mar 2022 • Alejandro Escontrela, Xue Bin Peng, Wenhao Yu, Tingnan Zhang, Atil Iscen, Ken Goldberg, Pieter Abbeel

We also demonstrate that an effective style reward can be learned from a few seconds of motion capture data gathered from a German Shepherd and leads to energy-efficient locomotion strategies with natural gait transitions.

Paper
Add Code

Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning

no code implementations • 7 Apr 2022 • Carl Qi, Pieter Abbeel, Aditya Grover

The goal of imitation learning is to mimic expert behavior from demonstrations, without access to an explicit reward signal.

Imitation Learning reinforcement-learning +2

Paper
Add Code

Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking

no code implementations • 14 Apr 2022 • Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-hui Liu, Pieter Abbeel, Qi Dou

To continuously improve the quality of pseudo labels, we iterate the above steps by taking the trained student model as a new teacher and re-label real data using the refined teacher model.

6D Pose Estimation using RGB Robotic Grasping

Paper
Add Code

Safe Exploration in Markov Decision Processes

no code implementations • 22 May 2012 • Teodor Mihai Moldovan, Pieter Abbeel

We show that imposing safety by restricting attention to the resulting set of guaranteed safe policies is NP-hard.

Safe Exploration

Paper
Add Code

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

no code implementations • 7 Jun 2022 • Zhao Mandi, Pieter Abbeel, Stephen James

From these findings, we advocate for evaluating future meta-RL methods on more challenging tasks and including multi-task pretraining with fine-tuning as a simple, yet strong baseline.

Meta-Learning Meta Reinforcement Learning +4

Paper
Add Code

Masked World Models for Visual Control

no code implementations • 28 Jun 2022 • Younggyo Seo, Danijar Hafner, Hao liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel

Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects.

Model-based Reinforcement Learning Reinforcement Learning (RL) +1

Paper
Add Code

Multi-Objective Policy Gradients with Topological Constraints

no code implementations • 15 Sep 2022 • Kyle Hollins Wray, Stas Tiomkin, Mykel J. Kochenderfer, Pieter Abbeel

Multi-objective optimization models that encode ordered sequential constraints provide a solution to model various challenging problems including encoding preferences, modeling a curriculum, and enforcing measures of safety.

Paper
Add Code

HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator

no code implementations • 15 Sep 2022 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel

Video prediction is an important yet challenging problem; burdened with the tasks of generating future frames and learning environment dynamics.

Data Augmentation Video Prediction +1

Paper
Add Code

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction

no code implementations • 13 Oct 2022 • Yuxuan Liu, Nikhil Mishra, Maximilian Sieb, Yide Shentu, Pieter Abbeel, Xi Chen

3D bounding boxes are a widespread intermediate representation in many computer vision applications.

Paper
Add Code

Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions

no code implementations • 23 Oct 2022 • Weirui Ye, Pieter Abbeel, Yang Gao

This paper proposes the Virtual MCTS (V-MCTS), a variant of MCTS that spends more search time on harder states and less search time on simpler states adaptively.

Atari Games Board Games

Paper
Add Code

Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

no code implementations • 24 Oct 2022 • Hao liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

Large language models (LLM) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.

Language Modelling Natural Language Inference +1

Paper
Add Code

StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS

no code implementations • 3 Nov 2022 • Kai Chen, Stephen James, Congying Sui, Yun-hui Liu, Pieter Abbeel, Qi Dou

To further improve the performance of the stereo framework, StereoPose is equipped with a parallax attention module for stereo feature fusion and an epipolar loss for improving the stereo-view consistency of network predictions.

Object Pose Estimation +1

Paper
Add Code

VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models

no code implementations • CVPR 2023 • Ajay Jain, Amber Xie, Pieter Abbeel

We show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics.

Image Generation Text to 3D +1

Paper
Add Code

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

no code implementations • 23 Nov 2022 • David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications.

Decision Making

Paper
Add Code

Robust and Versatile Bipedal Jumping Control through Reinforcement Learning

no code implementations • 19 Feb 2023 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Aligning Text-to-Image Models using Human Feedback

no code implementations • 23 Feb 2023 • Kimin Lee, Hao liu, MoonKyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.

Image Generation

Paper
Add Code

Foundation Models for Decision Making: Problems, Methods, and Opportunities

no code implementations • 7 Mar 2023 • Sherry Yang, Ofir Nachum, Yilun Du, Jason Wei, Pieter Abbeel, Dale Schuurmans

In response to these developments, new paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.

Autonomous Driving Decision Making +1

Paper
Add Code

Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN

no code implementations • 3 May 2023 • Yuxuan Liu, Nikhil Mishra, Pieter Abbeel, Xi Chen

Existing state-of-the-art methods are often unable to capture meaningful uncertainty in challenging or ambiguous scenes, and as such can cause critical errors in high-performance applications.

Instance Segmentation Object Recognition +2

Paper
Add Code

Self-Supervised Instance Segmentation by Grasping

no code implementations • 10 May 2023 • Yuxuan Liu, Xi Chen, Pieter Abbeel

Leveraging this insight, we learn a grasp segmentation model to segment the grasped object from before and after grasp images.

Instance Segmentation Robotic Grasping +2

Paper
Add Code

Emergent Agentic Transformer from Chain of Hindsight Experience

no code implementations • 26 May 2023 • Hao liu, Pieter Abbeel

Our method consists of relabelling target return of each trajectory to the maximum total reward among in sequence of trajectories and training an autoregressive model to predict actions conditioning on past states, actions, rewards, target returns, and task completion tokens, the resulting model, Agentic Transformer (AT), can learn to improve upon itself both at training and test time.

D4RL Imitation Learning +2

Paper
Add Code

Probabilistic Adaptation of Text-to-Video Models

no code implementations • 2 Jun 2023 • Mengjiao Yang, Yilun Du, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel

Large text-to-video models trained on internet-scale data have demonstrated exceptional capabilities in generating high-fidelity videos from arbitrary textual descriptions.

Language Modelling Large Language Model

Paper
Add Code

ALP: Action-Aware Embodied Learning for Perception

no code implementations • 16 Jun 2023 • Xinran Liang, Anthony Han, Wilson Yan, aditi raghunathan, Pieter Abbeel

In addition, we show that by training on actively collected data more relevant to the environment and task, our method generalizes more robustly to downstream tasks compared to models pre-trained on fixed datasets such as ImageNet.

Benchmarking object-detection +3

Paper
Add Code

SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks

no code implementations • 7 Jul 2023 • Xingyu Lin, John So, Sashwat Mahalingam, Fangchen Liu, Pieter Abbeel

In this work, we present a focused study of the generalization capabilities of the pre-trained visual representations at the categorical level.

Imitation Learning

Paper
Add Code

Learning to Model the World with Language

no code implementations • 31 Jul 2023 • Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan

To interact with humans in the world, agents need to understand the diverse types of language that people use, relate them to the visual world, and act based on them.

Future prediction General Knowledge +1

Paper
Add Code

Language-Conditioned Path Planning

no code implementations • 31 Aug 2023 • Amber Xie, Youngwoon Lee, Pieter Abbeel, Stephen James

Contact is at the core of robotic manipulation.

Paper
Add Code

Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

no code implementations • 25 Sep 2023 • Jiangliu Wang, Jianbo Jiao, Yibing Song, Stephen James, Zhan Tong, Chongjian Ge, Pieter Abbeel, Yun-hui Liu

This work aims to improve unsupervised audio-visual pre-training.

Contrastive Learning Data Augmentation

Paper
Add Code

Foundation Reinforcement Learning: towards Embodied Generalist Agents with Foundation Prior Assistance

no code implementations • 4 Oct 2023 • Weirui Ye, Yunsheng Zhang, Mengchen Wang, Shengjie Wang, Xianfan Gu, Pieter Abbeel, Yang Gao

Our method tolerates the unavoidable noise in embodied foundation models.

Quantization reinforcement-learning

Paper
Add Code

Learning Interactive Real-World Simulators

no code implementations • 9 Oct 2023 • Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel

Applications of a real-world simulator range from controllable content creation in games and movies, to training embodied agents purely in simulation that can be directly deployed in the real world.

Video Captioning

Paper
Add Code

Exploration with Principles for Diverse AI Supervision

no code implementations • 13 Oct 2023 • Hao liu, Matei Zaharia, Pieter Abbeel

Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.

Reinforcement Learning (RL) Unsupervised Reinforcement Learning

Paper
Add Code

Interactive Task Planning with Language Models

no code implementations • 16 Oct 2023 • Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution.

Language Modelling Large Language Model +1

Paper
Add Code

Video Language Planning

no code implementations • 16 Oct 2023 • Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

We are interested in enabling visual planning for complex long-horizon tasks in the space of generated videos and language, leveraging recent advances in large generative models pretrained on Internet-scale data.

Paper
Add Code

Managing AI Risks in an Era of Rapid Progress

no code implementations • 26 Oct 2023 • Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila Mcilraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

In this short consensus paper, we outline risks from upcoming, advanced AI systems.

Paper
Add Code

DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing

no code implementations • 2 Nov 2023 • Vint Lee, Pieter Abbeel, Youngwoon Lee

Model-based reinforcement learning (MBRL) has gained much attention for its ability to learn complex behaviors in a sample-efficient way: planning actions by generating imaginary trajectories with predicted rewards.

Model-based Reinforcement Learning reinforcement-learning

Paper
Add Code

The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning

no code implementations • 2 Nov 2023 • Carmelo Sferrazza, Younggyo Seo, Hao liu, Youngwoon Lee, Pieter Abbeel

For tasks requiring object manipulation, we seamlessly and effectively exploit the complementarity of our senses of vision and touch.

Paper
Add Code

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

no code implementations • 2 Nov 2023 • Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell

Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset.

Instruction Following

Paper
Add Code

Scalable Diffusion for Materials Generation

no code implementations • 18 Oct 2023 • Mengjiao Yang, KwangHwan Cho, Amil Merchant, Pieter Abbeel, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

Lastly, we show that conditional generation with UniMat can scale to previously established crystal datasets with up to millions of crystals structures, outperforming random structure search (the current leading method for structure discovery) in discovering new stable materials.

Formation Energy

Paper
Add Code

Motion-Conditioned Image Animation for Video Editing

no code implementations • 30 Nov 2023 • Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi

We introduce MoCA, a Motion-Conditioned Image Animation approach for video editing.

Image Animation Video Editing

Paper
Add Code

Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?

no code implementations • NeurIPS 2023 • Arjun Majumdar, Karmesh Yadav, Sergio Arnaud, Yecheng Jason Ma, Claire Chen, Sneha Silwal, Aryan Jain, Vincent-Pierre Berges, Pieter Abbeel, Jitendra Malik, Dhruv Batra, Yixin Lin, Oleksandr Maksymets, Aravind Rajeswaran, Franziska Meier

Contrary to inferences from prior work, we find that scaling dataset size and diversity does not improve performance universally (but does so on average).

Paper
Add Code

Learning a Diffusion Model Policy from Rewards via Q-Score Matching

no code implementations • 18 Dec 2023 • Michael Psenka, Alejandro Escontrela, Pieter Abbeel, Yi Ma

Diffusion models have become a popular choice for representing actor policies in behavior cloning and offline reinforcement learning.

reinforcement-learning

Paper
Add Code

Any-point Trajectory Modeling for Policy Learning

no code implementations • 28 Dec 2023 • Chuan Wen, Xingyu Lin, John So, Kai Chen, Qi Dou, Yang Gao, Pieter Abbeel

Learning from demonstration is a powerful method for teaching robots new skills, and having more demonstration data often improves policy learning.

Trajectory Modeling Transfer Learning

Paper
Add Code

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

no code implementations • 8 Jan 2024 • Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine

This kind of data-driven optimization (DDO) presents a range of challenges beyond those in standard prediction problems, since we need models that successfully predict the performance of new designs that are better than the best designs seen in the training set.

Paper
Add Code

Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

no code implementations • 30 Jan 2024 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Video as the New Language for Real-World Decision Making

no code implementations • 27 Feb 2024 • Sherry Yang, Jacob Walker, Jack Parker-Holder, Yilun Du, Jake Bruce, Andre Barreto, Pieter Abbeel, Dale Schuurmans

Moreover, we demonstrate how, like language models, video generation can serve as planners, agents, compute engines, and environment simulators through techniques such as in-context learning, planning and reinforcement learning.

Decision Making In-Context Learning +2

Paper
Add Code

MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting

no code implementations • 5 Mar 2024 • Fangchen Liu, Kuan Fang, Pieter Abbeel, Sergey Levine

In this paper, we present MOKA (Marking Open-vocabulary Keypoint Affordances), an approach that employs VLMs to solve robotic manipulation tasks specified by free-form language descriptions.

In-Context Learning Question Answering +2

Paper
Add Code

Twisting Lids Off with Two Hands

no code implementations • 4 Mar 2024 • Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, Jitendra Malik

Manipulating objects with two multi-fingered hands has been a long-standing challenge in robotics, attributed to the contact-rich nature of many manipulation tasks and the complexity inherent in coordinating a high-dimensional bimanual system.

reinforcement-learning

Paper
Add Code

HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation

no code implementations • 15 Mar 2024 • Carmelo Sferrazza, Dun-Ming Huang, Xingyu Lin, Youngwoon Lee, Pieter Abbeel

Humanoid robots hold great promise in assisting humans in diverse environments and tasks, due to their flexibility and adaptability leveraging human-like morphology.

Paper
Add Code

Adversarial Attacks on Neural Network Policies

1 code implementation • 8 Feb 2017 • Sandy Huang, Nicolas Papernot, Ian Goodfellow, Yan Duan, Pieter Abbeel

Machine learning classifiers are known to be vulnerable to inputs maliciously constructed by adversaries to force misclassification.

Paper
Code

JUMBO: Scalable Multi-task Bayesian Optimization using Offline Data

1 code implementation • 2 Jun 2021 • Kourosh Hakhamaneshi, Pieter Abbeel, Vladimir Stojanovic, Aditya Grover

Such a decomposition can dynamically control the reliability of information derived from the online and offline data and the use of pretrained neural networks permits scalability to large offline datasets.

Bayesian Optimization Gaussian Processes

Paper
Code

Automatic Goal Generation for Reinforcement Learning Agents

1 code implementation • ICML 2018 • Carlos Florensa, David Held, Xinyang Geng, Pieter Abbeel

Instead, we propose a method that allows an agent to automatically discover the range of tasks that it is capable of performing.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

1 code implementation • 3 Jul 2015 • Bradly C. Stadie, Sergey Levine, Pieter Abbeel

By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces.

Ranked #24 on Atari Games on Atari 2600 Q*Bert

Atari Games reinforcement-learning +2

Paper
Code

Convolutional Occupancy Models for Dense Packing of Complex, Novel Objects

1 code implementation • 31 Jul 2023 • Nikhil Mishra, Pieter Abbeel, Xi Chen, Maximilian Sieb

Dense packing in pick-and-place systems is an important feature in many warehouse and logistics applications.

Paper
Code

Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs

1 code implementation • 7 Mar 2024 • Nikhil Mishra, Maximilian Sieb, Pieter Abbeel, Xi Chen

Deep learning methods for perception are the cornerstone of many robotic systems.

Paper
Code

Gradient Estimation Using Stochastic Computation Graphs

1 code implementation • NeurIPS 2015 • John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world.

Paper
Code

AvE: Assistance via Empowerment

1 code implementation • NeurIPS 2020 • Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person's goal(s).

Paper
Code

CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

1 code implementation • 19 Oct 2022 • Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica

Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks.

Reinforcement Learning (RL) Representation Learning +1

Paper
Code

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

1 code implementation • 29 Jan 2022 • Julius Frost, Olivia Watkins, Eric Weiner, Pieter Abbeel, Trevor Darrell, Bryan Plummer, Kate Saenko

In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test-time.

counterfactual Decision Making +2

Paper
Code

Improving Long-Horizon Imitation Through Instruction Prediction

1 code implementation • 21 Jun 2023 • Joey Hejna, Pieter Abbeel, Lerrel Pinto

Complex, long-horizon planning and its combinatorial nature pose steep challenges for learning-based agents.

Paper
Code

Guiding Policies with Language via Meta-Learning

1 code implementation • ICLR 2019 • John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.

Imitation Learning Instruction Following +1

Paper
Code

Variable Skipping for Autoregressive Range Density Estimation

1 code implementation • ICML 2020 • Eric Liang, Zongheng Yang, Ion Stoica, Pieter Abbeel, Yan Duan, Xi Chen

In this paper, we explore a technique, variable skipping, for accelerating range density estimation over deep autoregressive models.

Data Augmentation Density Estimation

Paper
Code

Emergence of Grounded Compositional Language in Multi-Agent Populations

1 code implementation • 15 Mar 2017 • Igor Mordatch, Pieter Abbeel

By capturing statistical patterns in large corpora, machine learning has enabled significant advances in natural language processing, including in machine translation, question answering, and sentiment analysis.

Machine Translation Question Answering +2

Paper
Code

Cooperative Inverse Reinforcement Learning

2 code implementations • NeurIPS 2016 • Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart Russell

For an autonomous system to be helpful to humans and to pose no unwarranted risks, it needs to align its values with those of the humans in its environment in such a way that its actions contribute to the maximization of value for the humans.

Active Learning reinforcement-learning +1

Paper
Code

Deep Spatial Autoencoders for Visuomotor Learning

1 code implementation • 21 Sep 2015 • Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel

Our method uses a deep spatial autoencoder to acquire a set of feature points that describe the environment for the current task, such as the positions of objects, and then learns a motion skill with these feature points using an efficient reinforcement learning method based on local linear models.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Compositional Plan Vectors

1 code implementation • NeurIPS 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

Imitation Learning

Paper
Code

Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks

1 code implementation • 16 Sep 2022 • Litian Liang, Yaosheng Xu, Stephen Mcaleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

On a set of 26 benchmark Atari environments, MeanQ outperforms all tested baselines, including the best available baseline, SUNRISE, at 100K interaction steps in 16/26 environments, and by 68% on average.

Paper
Code

Backprop KF: Learning Discriminative Deterministic State Estimators

1 code implementation • NeurIPS 2016 • Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.

Autonomous Vehicles Visual Odometry

Paper
Code

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

2 code implementations • ICLR 2022 • Xinran Liang, Katherine Shu, Kimin Lee, Pieter Abbeel

Our intuition is that disagreement in learned reward model reflects uncertainty in tailored human feedback and could be useful for exploration.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

1 code implementation • NeurIPS 2019 • Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine

In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors.

Continuous Control

Paper
Code

Benchmarking Model-Based Reinforcement Learning

2 code implementations • 3 Jul 2019 • Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba

Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL.

Benchmarking Model-based Reinforcement Learning +3

Paper
Code

Hierarchical Few-Shot Imitation with Skill Transition Models

1 code implementation • ICML Workshop URL 2021 • Kourosh Hakhamaneshi, Ruihan Zhao, Albert Zhan, Pieter Abbeel, Michael Laskin

To this end, we present Few-shot Imitation with Skill Transition Models (FIST), an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks given a few downstream demonstrations.

Paper
Code

Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design

1 code implementation • 29 Mar 2022 • Kourosh Hakhamaneshi, Marcel Nassar, Mariano Phielipp, Pieter Abbeel, Vladimir Stojanović

We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties with up to 10x more sample efficiency compared to a randomly initialized model.

Paper
Code

Patch-based Object-centric Transformers for Efficient Video Generation

1 code implementation • 8 Jun 2022 • Wilson Yan, Ryo Okumura, Stephen James, Pieter Abbeel

In this work, we present Patch-based Object-centric Video Transformer (POVT), a novel region-based video generation architecture that leverages object-centric information to efficiently model temporal dynamics in videos.

Object Video Editing +2

Paper
Code

Geometry-Aware Neural Rendering

1 code implementation • NeurIPS 2019 • Josh Tobin, OpenAI Robotics, Pieter Abbeel

Understanding the 3-dimensional structure of the world is a core challenge in computer vision and robotics.

Neural Rendering

Paper
Code

Hallucinative Topological Memory for Zero-Shot Visual Planning

1 code implementation • ICML 2020 • Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar

In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e. g., images obtained from self-supervised robot interaction.

Paper
Code

Asynchronous Methods for Model-Based Reinforcement Learning

1 code implementation • 28 Oct 2019 • Yunzhi Zhang, Ignasi Clavera, Boren Tsai, Pieter Abbeel

In this work, we propose an asynchronous framework for model-based reinforcement learning methods that brings down the run time of these algorithms to be just the data collection time.

Model-based Reinforcement Learning reinforcement-learning +1

Paper
Code

It Takes Four to Tango: Multiagent Self Play for Automatic Curriculum Generation

1 code implementation • ICLR 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

We are interested in training general-purpose reinforcement learning agents that can solve a wide variety of goals.

Paper
Code

Adaptive Online Planning for Continual Lifelong Learning

1 code implementation • 3 Dec 2019 • Kevin Lu, Igor Mordatch, Pieter Abbeel

We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change.

Paper
Code

Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation

3 code implementations • 12 Oct 2017 • Tianhao Zhang, Zoe McCarthy, Owen Jow, Dennis Lee, Xi Chen, Ken Goldberg, Pieter Abbeel

Imitation learning is a powerful paradigm for robot skill acquisition.

Imitation Learning

Paper
Code

Hierarchically Decoupled Imitation for Morphological Transfer

1 code implementation • 3 Mar 2020 • Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto

Learning long-range behaviors on complex high-dimensional agents is a fundamental problem in robot learning.

Paper
Code

Teachable Reinforcement Learning via Advice Distillation

1 code implementation • NeurIPS 2021 • Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention.

Imitation Learning reinforcement-learning +1

Paper
Code

Synkhronos: a Multi-GPU Theano Extension for Data Parallelism

1 code implementation • 11 Oct 2017 • Adam Stooke, Pieter Abbeel

We present Synkhronos, an extension to Theano for multi-GPU computations leveraging data parallelism.

Paper
Code

Deep Object-Centric Representations for Generalizable Robot Learning

1 code implementation • 14 Aug 2017 • Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine

We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.

Object Reinforcement Learning (RL)

Paper
Code

Task-Agnostic Morphology Evolution

1 code implementation • ICLR 2021 • Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto

Deep reinforcement learning primarily focuses on learning behavior, usually overlooking the fact that an agent's function is largely determined by form.

Paper
Code

AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation

1 code implementation • NeurIPS 2023 • Daiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon, Stefanos Leonardos, Pieter Abbeel, Kee-Eung Kim

To this end, we introduce AlberDICE, an offline MARL algorithm that alternatively performs centralized training of individual agents based on stationary distribution optimization.

Reinforcement Learning (RL)

Paper
Code

Parallel Training of Deep Networks with Local Updates

1 code implementation • 7 Dec 2020 • Michael Laskin, Luke Metz, Seth Nabarro, Mark Saroufim, Badreddine Noune, Carlo Luschi, Jascha Sohl-Dickstein, Pieter Abbeel

Deep learning models trained on large data sets have been widely successful in both vision and language domains.

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.