Search Results for author: Pieter Abbeel

CURL extracts high level features from raw pixels using a contrastive learning objective and performs off-policy control on top of the extracted features.

Contrastive Learning reinforcement-learning +2


HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation

no code implementations • 15 Mar 2024 • Carmelo Sferrazza, Dun-Ming Huang, Xingyu Lin, Youngwoon Lee, Pieter Abbeel

Humanoid robots hold great promise in assisting humans in diverse environments and tasks, due to their flexibility and adaptability leveraging human-like morphology.


Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs

1 code implementation • 7 Mar 2024 • Nikhil Mishra, Maximilian Sieb, Pieter Abbeel, Xi Chen

Deep learning methods for perception are the cornerstone of many robotic systems.


MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting

no code implementations • 5 Mar 2024 • Fangchen Liu, Kuan Fang, Pieter Abbeel, Sergey Levine

In this paper, we present MOKA (Marking Open-vocabulary Keypoint Affordances), an approach that employs VLMs to solve robotic manipulation tasks specified by free-form language descriptions.

In-Context Learning Question Answering +2


Twisting Lids Off with Two Hands

no code implementations • 4 Mar 2024 • Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, Jitendra Malik

Manipulating objects with two multi-fingered hands has been a long-standing challenge in robotics, attributed to the contact-rich nature of many manipulation tasks and the complexity inherent in coordinating a high-dimensional bimanual system.

reinforcement-learning


Video as the New Language for Real-World Decision Making

no code implementations • 27 Feb 2024 • Sherry Yang, Jacob Walker, Jack Parker-Holder, Yilun Du, Jake Bruce, Andre Barreto, Pieter Abbeel, Dale Schuurmans

Moreover, we demonstrate how, like language models, video generation can serve as planners, agents, compute engines, and environment simulators through techniques such as in-context learning, planning and reinforcement learning.

Decision Making In-Context Learning +2


Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

1 code implementation • 27 Feb 2024 • Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine

Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner?

Offline RL reinforcement-learning


A StrongREJECT for Empty Jailbreaks

1 code implementation • 15 Feb 2024 • Alexandra Souly, Qingyuan Lu, Dillon Bowen, Tu Trinh, Elvis Hsieh, Sana Pandey, Pieter Abbeel, Justin Svegliato, Scott Emmons, Olivia Watkins, Sam Toyer

We show that our new grading scheme better accords with human judgment of response quality and overall jailbreak effectiveness, especially on the sort of low-quality responses that contribute the most to over-estimation of jailbreak performance on existing benchmarks.


World Model on Million-Length Video And Language With Blockwise RingAttention

1 code implementation • 13 Feb 2024 • Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel

To address these challenges, we curate a large dataset of diverse videos and books, utilize the Blockwise RingAttention technique to scalably train on long sequences, and gradually increase context size from 4K to 1M tokens.

4k Video Understanding


Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

no code implementations • 30 Jan 2024 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.

reinforcement-learning Reinforcement Learning (RL)


Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

no code implementations • 8 Jan 2024 • Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine

This kind of data-driven optimization (DDO) presents a range of challenges beyond those in standard prediction problems, since we need models that successfully predict the performance of new designs that are better than the best designs seen in the training set.


Any-point Trajectory Modeling for Policy Learning

no code implementations • 28 Dec 2023 • Chuan Wen, Xingyu Lin, John So, Kai Chen, Qi Dou, Yang Gao, Pieter Abbeel

Learning from demonstration is a powerful method for teaching robots new skills, and having more demonstration data often improves policy learning.

Trajectory Modeling Transfer Learning


Learning a Diffusion Model Policy from Rewards via Q-Score Matching

no code implementations • 18 Dec 2023 • Michael Psenka, Alejandro Escontrela, Pieter Abbeel, Yi Ma

Diffusion models have become a popular choice for representing actor policies in behavior cloning and offline reinforcement learning.

reinforcement-learning


Motion-Conditioned Image Animation for Video Editing

no code implementations • 30 Nov 2023 • Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi

We introduce MoCA, a Motion-Conditioned Image Animation approach for video editing.

Image Animation Video Editing


AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation

1 code implementation • NeurIPS 2023 • Daiki E. Matsunaga, Jongmin Lee, Jaeseok Yoon, Stefanos Leonardos, Pieter Abbeel, Kee-Eung Kim

To this end, we introduce AlberDICE, an offline MARL algorithm that alternately performs centralized training of individual agents based on stationary distribution optimization.

Reinforcement Learning (RL)


The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning

no code implementations • 2 Nov 2023 • Carmelo Sferrazza, Younggyo Seo, Hao Liu, Youngwoon Lee, Pieter Abbeel

For tasks requiring object manipulation, we seamlessly and effectively exploit the complementarity of our senses of vision and touch.


Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

no code implementations • 2 Nov 2023 • Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell

Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset.

Instruction Following


DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing

no code implementations • 2 Nov 2023 • Vint Lee, Pieter Abbeel, Youngwoon Lee

Model-based reinforcement learning (MBRL) has gained much attention for its ability to learn complex behaviors in a sample-efficient way: planning actions by generating imaginary trajectories with predicted rewards.

Model-based Reinforcement Learning reinforcement-learning


Managing AI Risks in an Era of Rapid Progress

no code implementations • 26 Oct 2023 • Yoshua Bengio, Geoffrey Hinton, Andrew Yao, Dawn Song, Pieter Abbeel, Yuval Noah Harari, Ya-Qin Zhang, Lan Xue, Shai Shalev-Shwartz, Gillian Hadfield, Jeff Clune, Tegan Maharaj, Frank Hutter, Atılım Güneş Baydin, Sheila McIlraith, Qiqi Gao, Ashwin Acharya, David Krueger, Anca Dragan, Philip Torr, Stuart Russell, Daniel Kahneman, Jan Brauner, Sören Mindermann

In this short consensus paper, we outline risks from upcoming, advanced AI systems.


Scalable Diffusion for Materials Generation

no code implementations • 18 Oct 2023 • Mengjiao Yang, KwangHwan Cho, Amil Merchant, Pieter Abbeel, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

Lastly, we show that conditional generation with UniMat can scale to previously established crystal datasets with up to millions of crystal structures, outperforming random structure search (the current leading method for structure discovery) in discovering new stable materials.

Formation Energy


Interactive Task Planning with Language Models

no code implementations • 16 Oct 2023 • Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

An interactive robot framework accomplishes long-horizon task planning and can easily generalize to new goals or distinct tasks, even during execution.

Language Modelling Large Language Model +1


Video Language Planning

no code implementations • 16 Oct 2023 • Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

We are interested in enabling visual planning for complex long-horizon tasks in the space of generated videos and language, leveraging recent advances in large generative models pretrained on Internet-scale data.


Exploration with Principles for Diverse AI Supervision

no code implementations • 13 Oct 2023 • Hao Liu, Matei Zaharia, Pieter Abbeel

Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.

Reinforcement Learning (RL) Unsupervised Reinforcement Learning


Learning Interactive Real-World Simulators

no code implementations • 9 Oct 2023 • Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel

Applications of a real-world simulator range from controllable content creation in games and movies, to training embodied agents purely in simulation that can be directly deployed in the real world.

Video Captioning


Foundation Reinforcement Learning: towards Embodied Generalist Agents with Foundation Prior Assistance

no code implementations • 4 Oct 2023 • Weirui Ye, Yunsheng Zhang, Mengchen Wang, Shengjie Wang, Xianfan Gu, Pieter Abbeel, Yang Gao

Our method tolerates the unavoidable noise in embodied foundation models.

Quantization reinforcement-learning


Ring Attention with Blockwise Transformers for Near-Infinite Context

3 code implementations • 3 Oct 2023 • Hao Liu, Matei Zaharia, Pieter Abbeel

Transformers have emerged as the architecture of choice for many state-of-the-art AI models, showcasing exceptional performance across a wide range of AI applications.

Language Modelling


Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

no code implementations • 25 Sep 2023 • Jiangliu Wang, Jianbo Jiao, Yibing Song, Stephen James, Zhan Tong, Chongjian Ge, Pieter Abbeel, Yun-hui Liu

This work aims to improve unsupervised audio-visual pre-training.

Contrastive Learning Data Augmentation


Language-Conditioned Path Planning

no code implementations • 31 Aug 2023 • Amber Xie, Youngwoon Lee, Pieter Abbeel, Stephen James

Contact is at the core of robotic manipulation.


Language Reward Modulation for Pretraining Reinforcement Learning

1 code implementation • 23 Aug 2023 • Ademi Adeniji, Amber Xie, Carmelo Sferrazza, Younggyo Seo, Stephen James, Pieter Abbeel

Using learned reward functions (LRFs) as a means to solve sparse-reward reinforcement learning (RL) tasks has yielded some steady progress in task-complexity through the years.

reinforcement-learning Reinforcement Learning (RL) +1


Convolutional Occupancy Models for Dense Packing of Complex, Novel Objects

1 code implementation • 31 Jul 2023 • Nikhil Mishra, Pieter Abbeel, Xi Chen, Maximilian Sieb

Dense packing in pick-and-place systems is an important feature in many warehouse and logistics applications.


Learning to Model the World with Language

no code implementations • 31 Jul 2023 • Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan

To interact with humans in the world, agents need to understand the diverse types of language that people use, relate them to the visual world, and act based on them.

Future prediction General Knowledge +1


SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks

no code implementations • 7 Jul 2023 • Xingyu Lin, John So, Sashwat Mahalingam, Fangchen Liu, Pieter Abbeel

In this work, we present a focused study of the generalization capabilities of the pre-trained visual representations at the categorical level.

Imitation Learning


Improving Long-Horizon Imitation Through Instruction Prediction

1 code implementation • 21 Jun 2023 • Joey Hejna, Pieter Abbeel, Lerrel Pinto

Complex, long-horizon planning and its combinatorial nature pose steep challenges for learning-based agents.


ALP: Action-Aware Embodied Learning for Perception

no code implementations • 16 Jun 2023 • Xinran Liang, Anthony Han, Wilson Yan, Aditi Raghunathan, Pieter Abbeel

In addition, we show that by training on actively collected data more relevant to the environment and task, our method generalizes more robustly to downstream tasks compared to models pre-trained on fixed datasets such as ImageNet.

Benchmarking object-detection +3


Probabilistic Adaptation of Text-to-Video Models

no code implementations • 2 Jun 2023 • Mengjiao Yang, Yilun Du, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel

Large text-to-video models trained on internet-scale data have demonstrated exceptional capabilities in generating high-fidelity videos from arbitrary textual descriptions.

Language Modelling Large Language Model


Train Offline, Test Online: A Real Robot Learning Benchmark

1 code implementation • 1 Jun 2023 • Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta

Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data.


Blockwise Parallel Transformer for Large Context Models

2 code implementations • 30 May 2023 • Hao Liu, Pieter Abbeel

Transformers have emerged as the cornerstone of state-of-the-art natural language processing models, showcasing exceptional performance across a wide range of AI applications.

Language Modelling


Emergent Agentic Transformer from Chain of Hindsight Experience

no code implementations • 26 May 2023 • Hao Liu, Pieter Abbeel

Our method relabels the target return of each trajectory to the maximum total reward within a sequence of trajectories and trains an autoregressive model to predict actions conditioned on past states, actions, rewards, target returns, and task completion tokens. The resulting model, Agentic Transformer (AT), can learn to improve upon itself both at training and test time.

D4RL Imitation Learning +2
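
The return relabeling described in this abstract can be illustrated with a minimal sketch. It assumes each trajectory is given as a plain list of per-step rewards; the function name `relabel_target_returns` and this data layout are hypothetical and are not taken from the paper's code.

```python
from typing import List, Sequence


def relabel_target_returns(trajectory_rewards: Sequence[Sequence[float]]) -> List[float]:
    """Give every trajectory in a sequence the same target return:
    the maximum total reward achieved by any trajectory in that sequence."""
    if not trajectory_rewards:
        return []
    # Total (undiscounted) reward of each trajectory.
    totals = [sum(rewards) for rewards in trajectory_rewards]
    best = max(totals)
    # Hindsight relabeling: each trajectory is conditioned on the best total.
    return [best] * len(totals)
```

For example, three trajectories with total rewards 3.0, 0.5, and 5.0 would all receive the target return 5.0.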


The False Promise of Imitating Proprietary LLMs

1 code implementation • 25 May 2023 • Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song

This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.

Language Modelling


DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

2 code implementations • 25 May 2023 • Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee

We focus on diffusion models, defining the fine-tuning task as an RL problem, and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.

reinforcement-learning Reinforcement Learning (RL)


Self-Supervised Instance Segmentation by Grasping

no code implementations • 10 May 2023 • Yuxuan Liu, Xi Chen, Pieter Abbeel

Leveraging this insight, we learn a grasp segmentation model to segment the grasped object from before and after grasp images.

Instance Segmentation Robotic Grasping +2


Masked Trajectory Models for Prediction, Representation, and Control

1 code implementation • 4 May 2023 • Philipp Wu, Arjun Majumdar, Kevin Stone, Yixin Lin, Igor Mordatch, Pieter Abbeel, Aravind Rajeswaran

We introduce Masked Trajectory Models (MTM) as a generic abstraction for sequential decision making.

Continuous Control Decision Making +2


Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN

no code implementations • 3 May 2023 • Yuxuan Liu, Nikhil Mishra, Pieter Abbeel, Xi Chen

Existing state-of-the-art methods are often unable to capture meaningful uncertainty in challenging or ambiguous scenes, and as such can cause critical errors in high-performance applications.

Instance Segmentation Object Recognition +2


RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning

1 code implementation • 9 Apr 2023 • Kevin Zakka, Philipp Wu, Laura Smith, Nimrod Gileadi, Taylor Howell, Xue Bin Peng, Sumeet Singh, Yuval Tassa, Pete Florence, Andy Zeng, Pieter Abbeel

Replicating human-like dexterity in robot hands represents one of the largest open problems in robotics.

Benchmarking Multi-Task Learning +2


Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?

no code implementations • NeurIPS 2023 • Arjun Majumdar, Karmesh Yadav, Sergio Arnaud, Yecheng Jason Ma, Claire Chen, Sneha Silwal, Aryan Jain, Vincent-Pierre Berges, Pieter Abbeel, Jitendra Malik, Dhruv Batra, Yixin Lin, Oleksandr Maksymets, Aravind Rajeswaran, Franziska Meier

Contrary to inferences from prior work, we find that scaling dataset size and diversity does not improve performance universally (but does so on average).


Foundation Models for Decision Making: Problems, Methods, and Opportunities

no code implementations • 7 Mar 2023 • Sherry Yang, Ofir Nachum, Yilun Du, Jason Wei, Pieter Abbeel, Dale Schuurmans

In response to these developments, new paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.

Autonomous Driving Decision Making +1


Preference Transformer: Modeling Human Preferences using Transformers for RL

1 code implementation • 2 Mar 2023 • Changyeon Kim, Jongjin Park, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

In this paper, we present Preference Transformer, a neural architecture that models human preferences using transformers.

Decision Making Reinforcement Learning (RL)


Aligning Text-to-Image Models using Human Feedback

no code implementations • 23 Feb 2023 • Kimin Lee, Hao Liu, MoonKyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu

Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.

Image Generation


Robust and Versatile Bipedal Jumping Control through Reinforcement Learning

no code implementations • 19 Feb 2023 • Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.

reinforcement-learning Reinforcement Learning (RL)


Guiding Pretraining in Reinforcement Learning with Large Language Models

1 code implementation • 13 Feb 2023 • Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas

Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function.

Common Sense Reasoning Language Modelling +2


Controllability-Aware Unsupervised Skill Discovery

3 code implementations • 10 Feb 2023 • Seohong Park, Kimin Lee, Youngwoon Lee, Pieter Abbeel

One of the key capabilities of intelligent agents is the ability to discover useful skills without external supervision.


The Wisdom of Hindsight Makes Language Models Better Instruction Followers

1 code implementation • 10 Feb 2023 • Tianjun Zhang, Fangchen Liu, Justin Wong, Pieter Abbeel, Joseph E. Gonzalez

In this paper, we consider an alternative approach: converting feedback to instruction by relabeling the original one and training the model for better alignment in a supervised manner.

Decision Making Language Modelling +2


Chain of Hindsight Aligns Language Models with Feedback

3 code implementations • 6 Feb 2023 • Hao Liu, Carmelo Sferrazza, Pieter Abbeel

Applying our method to large language models, we observed that Chain of Hindsight significantly surpasses previous methods in aligning language models with human preferences.


Multi-View Masked World Models for Visual Robotic Manipulation

1 code implementation • 5 Feb 2023 • Younggyo Seo, Junsu Kim, Stephen James, Kimin Lee, Jinwoo Shin, Pieter Abbeel

In this paper, we investigate how to learn good representations with multi-view data and utilize them for visual robotic manipulation.

Camera Calibration Representation Learning


Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment

1 code implementation • NeurIPS 2023 • Hao Liu, Wilson Yan, Pieter Abbeel

Recent progress in scaling up large language models has shown impressive capabilities in performing few-shot learning across a wide range of text-based tasks.

Attribute Few-Shot Image Classification +3


Multi-Environment Pretraining Enables Transfer to Action Limited Datasets

no code implementations • 23 Nov 2022 • David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications.

Decision Making


Masked Autoencoding for Scalable and Generalizable Decision Making

1 code implementation • 23 Nov 2022 • Fangchen Liu, Hao Liu, Aditya Grover, Pieter Abbeel

We are interested in learning scalable agents for reinforcement learning that can learn from large-scale, diverse sequential data similar to current large vision and language models.

Decision Making Offline RL +2


VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models

no code implementations • CVPR 2023 • Ajay Jain, Amber Xie, Pieter Abbeel

We show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics.

Image Generation Text to 3D +1


StereoPose: Category-Level 6D Transparent Object Pose Estimation from Stereo Images via Back-View NOCS

no code implementations • 3 Nov 2022 • Kai Chen, Stephen James, Congying Sui, Yun-hui Liu, Pieter Abbeel, Qi Dou

To further improve the performance of the stereo framework, StereoPose is equipped with a parallax attention module for stereo feature fusion and an epipolar loss for improving the stereo-view consistency of network predictions.

Object Pose Estimation +1


Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data

1 code implementation • 25 Oct 2022 • John So, Amber Xie, Sunggoo Jung, Jeffrey Edlund, Rohan Thakker, Ali Agha-mohammadi, Pieter Abbeel, Stephen James

In this paper, we address this challenge by presenting Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving, without using any real-world data.

Autonomous Driving Reinforcement Learning (RL) +2


Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

no code implementations • 24 Oct 2022 • Hao Liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

Large language models (LLMs) trained using the next-token-prediction objective, such as GPT-3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.

Language Modelling Natural Language Inference +1


Dichotomy of Control: Separating What You Can Control from What You Cannot

1 code implementation • 24 Oct 2022 • Mengjiao Yang, Dale Schuurmans, Pieter Abbeel, Ofir Nachum

While return-conditioning is at the heart of popular algorithms such as decision transformer (DT), these methods tend to perform poorly in highly stochastic environments, where an occasional high return can arise from randomness in the environment rather than the actions themselves.

Reinforcement Learning (RL)


Instruction-Following Agents with Multimodal Transformer

1 code implementation • 24 Oct 2022 • Hao Liu, Lisa Lee, Kimin Lee, Pieter Abbeel

Our method consists of a multimodal transformer that encodes visual observations and language instructions, and a transformer-based policy that predicts actions based on encoded representations.

Instruction Following Visual Grounding


Spending Thinking Time Wisely: Accelerating MCTS with Virtual Expansions

no code implementations • 23 Oct 2022 • Weirui Ye, Pieter Abbeel, Yang Gao

This paper proposes the Virtual MCTS (V-MCTS), a variant of MCTS that spends more search time on harder states and less search time on simpler states adaptively.

Atari Games Board Games


CLUTR: Curriculum Learning via Unsupervised Task Representation Learning

1 code implementation • 19 Oct 2022 • Abdus Salam Azad, Izzeddin Gur, Jasper Emhoff, Nathaniel Alexis, Aleksandra Faust, Pieter Abbeel, Ion Stoica

Recently, Unsupervised Environment Design (UED) emerged as a new paradigm for zero-shot generalization by simultaneously learning a task distribution and agent policies on the generated tasks.

Reinforcement Learning (RL) Representation Learning +1


Skill-Based Reinforcement Learning with Intrinsic Reward Matching

1 code implementation • 14 Oct 2022 • Ademi Adeniji, Amber Xie, Pieter Abbeel

While unsupervised skill discovery has shown promise in autonomously acquiring behavioral primitives, there is still a large methodological disconnect between task-agnostic skill pretraining and downstream, task-aware finetuning.

reinforcement-learning Reinforcement Learning (RL) +2


Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction

no code implementations • 13 Oct 2022 • Yuxuan Liu, Nikhil Mishra, Maximilian Sieb, Yide Shentu, Pieter Abbeel, Xi Chen

3D bounding boxes are a widespread intermediate representation in many computer vision applications.


Real-World Robot Learning with Masked Visual Pre-training

1 code implementation • 6 Oct 2022 • Ilija Radosavovic, Tete Xiao, Stephen James, Pieter Abbeel, Jitendra Malik, Trevor Darrell

Finally, we train a 307M parameter vision transformer on a massive collection of 4.5M images from the Internet and egocentric videos, and demonstrate clearly the benefits of scaling visual pre-training for robot learning.


Temporally Consistent Transformers for Video Generation

1 code implementation • 5 Oct 2022 • Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel

To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world.

Video Generation Video Prediction


Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks

1 code implementation • 16 Sep 2022 • Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

On a set of 26 benchmark Atari environments, MeanQ outperforms all tested baselines, including the best available baseline, SUNRISE, at 100K interaction steps in 16/26 environments, and by 68% on average.


Multi-Objective Policy Gradients with Topological Constraints

no code implementations • 15 Sep 2022 • Kyle Hollins Wray, Stas Tiomkin, Mykel J. Kochenderfer, Pieter Abbeel

Multi-objective optimization models that encode ordered sequential constraints provide a solution to model various challenging problems including encoding preferences, modeling a curriculum, and enforcing measures of safety.


HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator

no code implementations • 15 Sep 2022 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel

Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics.

Data Augmentation Video Prediction +1


AdaCat: Adaptive Categorical Discretization for Autoregressive Models

1 code implementation • 3 Aug 2022 • Qiyang Li, Ajay Jain, Pieter Abbeel

Autoregressive generative models can estimate complex continuous data distributions, like trajectory rollouts in an RL environment, image intensities, and audio.

Density Estimation Offline RL


Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision

1 code implementation • 29 Jun 2022 • Ryan Hoque, Lawrence Yunliang Chen, Satvik Sharma, Karthik Dharmarajan, Brijen Thananjeyan, Pieter Abbeel, Ken Goldberg

With continual learning, interventions from the remote pool of humans can also be used to improve the robot fleet control policy over time.

Continual Learning


Masked World Models for Visual Control

no code implementations • 28 Jun 2022 • Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel

Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects.

Model-based Reinforcement Learning Reinforcement Learning (RL) +1


DayDreamer: World Models for Physical Robot Learning

1 code implementation • 28 Jun 2022 • Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, Pieter Abbeel

Learning a world model to predict the outcomes of potential actions enables planning in imagination, reducing the amount of trial and error needed in the real environment.

Navigate reinforcement-learning +1


Patch-based Object-centric Transformers for Efficient Video Generation

1 code implementation • 8 Jun 2022 • Wilson Yan, Ryo Okumura, Stephen James, Pieter Abbeel

In this work, we present Patch-based Object-centric Video Transformer (POVT), a novel region-based video generation architecture that leverages object-centric information to efficiently model temporal dynamics in videos.

Object Video Editing +2


Deep Hierarchical Planning from Pixels

1 code implementation • 8 Jun 2022 • Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel

Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.

Atari Games Hierarchical Reinforcement Learning


On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

no code implementations • 7 Jun 2022 • Zhao Mandi, Pieter Abbeel, Stephen James

From these findings, we advocate for evaluating future meta-RL methods on more challenging tasks and including multi-task pretraining with fine-tuning as a simple, yet strong baseline.

Meta-Learning Meta Reinforcement Learning +4


Multimodal Masked Autoencoders Learn Transferable Representations

1 code implementation • 27 May 2022 • Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel

We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.

Contrastive Learning

Paper
Code

Reward Uncertainty for Exploration in Preference-based Reinforcement Learning

2 code implementations • ICLR 2022 • Xinran Liang, Katherine Shu, Kimin Lee, Pieter Abbeel

Our intuition is that disagreement in the learned reward model reflects uncertainty in tailored human feedback and could be useful for exploration.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Chain of Thought Imitation with Procedure Cloning

1 code implementation • 22 May 2022 • Mengjiao Yang, Dale Schuurmans, Pieter Abbeel, Ofir Nachum

Imitation learning aims to extract high-performance policies from logged demonstrations of expert behavior.

Imitation Learning Robot Manipulation

32,798

Paper
Code

An Empirical Investigation of Representation Learning for Imitation

2 code implementations • 16 May 2022 • Xin Chen, Sam Toyer, Cody Wild, Scott Emmons, Ian Fischer, Kuang-Huei Lee, Neel Alex, Steven H Wang, Ping Luo, Stuart Russell, Pieter Abbeel, Rohin Shah

We propose a modular framework for constructing representation learning algorithms, then use our framework to evaluate the utility of representation learning for imitation across several environment suites.

Image Classification Imitation Learning +1

Paper
Code

Coarse-to-fine Q-attention with Tree Expansion

1 code implementation • 26 Apr 2022 • Stephen James, Pieter Abbeel

Coarse-to-fine Q-attention enables sample-efficient robot manipulation by discretizing the translation space in a coarse-to-fine manner, where the resolution gradually increases at each layer in the hierarchy.

Robot Manipulation

143

Paper
Code

Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking

no code implementations • 14 Apr 2022 • Kai Chen, Rui Cao, Stephen James, Yichuan Li, Yun-hui Liu, Pieter Abbeel, Qi Dou

To continuously improve the quality of pseudo labels, we iterate the above steps by taking the trained student model as a new teacher and re-labeling real data using the refined teacher model.

6D Pose Estimation using RGB Robotic Grasping

Paper
Add Code

Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning

no code implementations • 7 Apr 2022 • Carl Qi, Pieter Abbeel, Aditya Grover

The goal of imitation learning is to mimic expert behavior from demonstrations, without access to an explicit reward signal.

Imitation Learning reinforcement-learning +2

Paper
Add Code

Coarse-to-Fine Q-attention with Learned Path Ranking

1 code implementation • 4 Apr 2022 • Stephen James, Pieter Abbeel

We propose Learned Path Ranking (LPR), a method that accepts an end-effector goal pose and learns to rank a set of goal-reaching paths generated from an array of path-generating methods, including path planning, Bezier curve sampling, and a learned policy.

Benchmarking

143

Paper
Code

Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design

1 code implementation • 29 Mar 2022 • Kourosh Hakhamaneshi, Marcel Nassar, Mariano Phielipp, Pieter Abbeel, Vladimir Stojanović

We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties with up to 10x more sample efficiency compared to a randomly initialized model.

Paper
Code

Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions

no code implementations • 28 Mar 2022 • Alejandro Escontrela, Xue Bin Peng, Wenhao Yu, Tingnan Zhang, Atil Iscen, Ken Goldberg, Pieter Abbeel

We also demonstrate that an effective style reward can be learned from a few seconds of motion capture data gathered from a German Shepherd and leads to energy-efficient locomotion strategies with natural gait transitions.

Paper
Add Code

Reinforcement Learning with Action-Free Pre-Training from Videos

2 code implementations • 25 Mar 2022 • Younggyo Seo, Kimin Lee, Stephen James, Pieter Abbeel

Our framework consists of two phases: we pre-train an action-free latent video prediction model, and then utilize the pre-trained representations for efficiently learning action-conditional world models on unseen environments.

reinforcement-learning Reinforcement Learning (RL) +2

102

Paper
Code

Teachable Reinforcement Learning via Advice Distillation

1 code implementation • NeurIPS 2021 • Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta

Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human expert, and learning from intermediate forms of supervision (like binary preferences) is time-consuming and extracts little information from each human intervention.

Imitation Learning reinforcement-learning +1

Paper
Code

SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning

no code implementations • ICLR 2022 • Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.

Data Augmentation Reinforcement Learning (RL)

Paper
Add Code

It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation

no code implementations • 22 Feb 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

Training such agents efficiently requires automatic generation of a goal curriculum.

Paper
Add Code

Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning

1 code implementation • 8 Feb 2022 • Stephen James, Pieter Abbeel

We propose a new policy parameterization for representing 3D rotations during reinforcement learning.

Continuous Control reinforcement-learning +2

Paper
Code

CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery

1 code implementation • 1 Feb 2022 • Michael Laskin, Hao Liu, Xue Bin Peng, Denis Yarats, Aravind Rajeswaran, Pieter Abbeel

We introduce Contrastive Intrinsic Control (CIC), an algorithm for unsupervised skill discovery that maximizes the mutual information between state-transitions and latent skill vectors.

Contrastive Learning reinforcement-learning +2

Paper
Code

Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning

1 code implementation • 31 Jan 2022 • Denis Yarats, David Brandfonbrener, Hao Liu, Michael Laskin, Pieter Abbeel, Alessandro Lazaric, Lerrel Pinto

In this work, we propose Exploratory data for Offline RL (ExORL), a data-centric approach to offline RL.

Offline RL reinforcement-learning +1

Paper
Code

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

1 code implementation • 29 Jan 2022 • Julius Frost, Olivia Watkins, Eric Weiner, Pieter Abbeel, Trevor Darrell, Bryan Plummer, Kate Saenko

In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test-time.

counterfactual Decision Making +2

Paper
Code

Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

1 code implementation • 18 Jan 2022 • Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch

However, the plans produced naively by LLMs often cannot map precisely to admissible actions.

Robot Task Planning World Knowledge

227

Paper
Code

Target Entropy Annealing for Discrete Soft Actor-Critic

no code implementations • 6 Dec 2021 • Yaosheng Xu, Dailin Hu, Litian Liang, Stephen McAleer, Pieter Abbeel, Roy Fox

Soft Actor-Critic (SAC) is considered the state-of-the-art algorithm in continuous action space settings.

Atari Games Scheduling

Paper
Add Code

Zero-Shot Text-Guided Object Generation with Dream Fields

4 code implementations • CVPR 2022 • Ajay Jain, Ben Mildenhall, Jonathan T. Barron, Pieter Abbeel, Ben Poole

Our method, Dream Fields, can generate the geometry and color of a wide range of objects without 3D supervision.

Neural Rendering Object

32,791

Paper
Code

Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL

no code implementations • NeurIPS 2021 • Charles Packer, Pieter Abbeel, Joseph E. Gonzalez

Meta-reinforcement learning (meta-RL) has proven to be a successful framework for leveraging experience from prior tasks to rapidly learn new related tasks; however, current meta-RL approaches struggle to learn in sparse reward environments.

Meta Reinforcement Learning

Paper
Add Code

Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning

no code implementations • 28 Nov 2021 • Dailin Hu, Pieter Abbeel, Roy Fox

Maximum Entropy Reinforcement Learning (MaxEnt RL) algorithms such as Soft Q-Learning (SQL) and Soft Actor-Critic trade off reward and policy entropy, which has the potential to improve training stability and robustness.

Q-Learning reinforcement-learning +2

Paper
Add Code

Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning

no code implementations • 4 Nov 2021 • Wenlong Huang, Igor Mordatch, Pieter Abbeel, Deepak Pathak

We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically-diverse real-world objects and generalize to new objects with unseen shape or size.

Multi-Task Learning Object +2

Paper
Add Code

B-Pref: Benchmarking Preference-Based Reinforcement Learning

1 code implementation • 4 Nov 2021 • Kimin Lee, Laura Smith, Anca Dragan, Pieter Abbeel

However, it is difficult to quantify the progress in preference-based RL due to the lack of a commonly adopted benchmark.

Benchmarking reinforcement-learning +1

101

Paper
Code

Mastering Atari Games with Limited Data

3 code implementations • NeurIPS 2021 • Weirui Ye, Shaohuai Liu, Thanard Kurutach, Pieter Abbeel, Yang Gao

Recently, there has been significant progress in sample efficient image-based RL algorithms; however, consistent human-level performance on the Atari game benchmark remains an elusive goal.

Ranked #2 on Atari Games 100k on Atari 100k

Atari Games Atari Games 100k

2,376

Paper
Code

URLB: Unsupervised Reinforcement Learning Benchmark

1 code implementation • 28 Oct 2021 • Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel

Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to solve a range of complex yet specific control tasks.

Continuous Control reinforcement-learning +2

321

Paper
Code

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates

no code implementations • 28 Oct 2021 • Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, Roy Fox

Under the belief that $\beta$ is closely related to the (state-dependent) model uncertainty, Entropy Regularized Q-Learning (EQL) further introduces a principled scheduling of $\beta$ by maintaining a collection of model parameters that characterizes model uncertainty.

Q-Learning Scheduling

Paper
Add Code

Towards More Generalizable One-shot Visual Imitation Learning

no code implementations • 26 Oct 2021 • Zhao Mandi, Fangchen Liu, Kimin Lee, Pieter Abbeel

We then study the multi-task setting, where multi-task training is followed by (i) one-shot imitation on variations within the training tasks, (ii) one-shot imitation on new tasks, and (iii) fine-tuning on new tasks.

Contrastive Learning Imitation Learning +2

Paper
Add Code

Autoregressive Latent Video Prediction with High-Fidelity Image Generator

no code implementations • 29 Sep 2021 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel

Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics.

Data Augmentation Video Prediction +1

Paper
Add Code

Semi-supervised Offline Reinforcement Learning with Pre-trained Decision Transformers

no code implementations • 29 Sep 2021 • Catherine Cang, Kourosh Hakhamaneshi, Ryan Rudes, Igor Mordatch, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

In this paper, we investigate how we can leverage large reward-free (i.e., task-agnostic) offline datasets of prior interactions to pre-train agents that can then be fine-tuned using a small reward-annotated dataset.

D4RL Offline RL +2

Paper
Add Code

Pretraining for Language Conditioned Imitation with Transformers

no code implementations • 29 Sep 2021 • Aaron L Putterman, Kevin Lu, Igor Mordatch, Pieter Abbeel

We study reinforcement learning (RL) agents which can utilize language inputs.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

It Takes Four to Tango: Multiagent Self Play for Automatic Curriculum Generation

1 code implementation • ICLR 2022 • Yuqing Du, Pieter Abbeel, Aditya Grover

We are interested in training general-purpose reinforcement learning agents that can solve a wide variety of goals.

Paper
Code

Improving Long-Horizon Imitation Through Language Prediction

no code implementations • 29 Sep 2021 • Donald Joseph Hejna III, Pieter Abbeel, Lerrel Pinto

Complex, long-horizon planning and its combinatorial nature pose steep challenges for learning-based agents.

Paper
Add Code

APS: Active Pretraining with Successor Features

no code implementations • 31 Aug 2021 • Hao Liu, Pieter Abbeel

We introduce a new unsupervised pretraining objective for reinforcement learning.

Ranked #5 on Unsupervised Reinforcement Learning on URLB (states, 5*10^5 frames)

Unsupervised Reinforcement Learning

Paper
Add Code

Skill Preferences: Learning to Extract and Execute Robotic Skills from Human Feedback

no code implementations • 11 Aug 2021 • Xiaofei Wang, Kimin Lee, Kourosh Hakhamaneshi, Pieter Abbeel, Michael Laskin

A promising approach to solving challenging long-horizon tasks has been to extract behavior priors (skills) by fitting generative models to large offline datasets of demonstrations.

Paper
Add Code

Playful Interactions for Representation Learning

no code implementations • 19 Jul 2021 • Sarah Young, Jyothish Pari, Pieter Abbeel, Lerrel Pinto

In this work, we propose to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks.

Imitation Learning Representation Learning

Paper
Add Code

Hierarchical Few-Shot Imitation with Skill Transition Models

1 code implementation • ICML Workshop URL 2021 • Kourosh Hakhamaneshi, Ruihan Zhao, Albert Zhan, Pieter Abbeel, Michael Laskin

To this end, we present Few-shot Imitation with Skill Transition Models (FIST), an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks given a few downstream demonstrations.

Paper
Code

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations • 5 Jul 2021 • Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning

Paper
Add Code

Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble

1 code implementation • 1 Jul 2021 • SeungHyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin

Recent advances in deep offline reinforcement learning (RL) have made it possible to train strong robotic agents from offline datasets.

Offline RL reinforcement-learning +1

Paper
Code

Scenic4RL: Programmatic Modeling and Generation of Reinforcement Learning Environments

no code implementations • 18 Jun 2021 • Abdus Salam Azad, Edward Kim, Qiancheng Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Sanjit A. Seshia

To showcase the benefits, we interfaced SCENIC to an existing RTS environment, the Google Research Football (GRF) simulator, and introduced a benchmark consisting of 32 realistic scenarios, encoded in SCENIC, to train RL agents and test their generalization capabilities.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL

no code implementations • 16 Jun 2021 • Catherine Cang, Aravind Rajeswaran, Pieter Abbeel, Michael Laskin

When combined together, they substantially improve the performance and generalization of offline RL policies.

D4RL Domain Generalization +2

Paper
Add Code

Unsupervised Learning of Visual 3D Keypoints for Control

1 code implementation • 14 Jun 2021 • Boyuan Chen, Pieter Abbeel, Deepak Pathak

Prior works show that structured latent spaces such as visual keypoints often outperform unstructured representations for robotic control.

Paper
Code

Data-Efficient Exploration with Self Play for Atari

no code implementations • ICML Workshop URL 2021 • Michael Laskin, Catherine Cang, Ryan Rudes, Pieter Abbeel

To alleviate the reliance on reward engineering, it is important to develop RL algorithms capable of efficiently acquiring skills with no rewards extrinsic to the agent.

Efficient Exploration Reinforcement Learning (RL)

Paper
Add Code

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

2 code implementations • 9 Jun 2021 • Kimin Lee, Laura Smith, Pieter Abbeel

We also show that our method is able to utilize real-time human feedback to effectively prevent reward exploitation and learn new behaviors that are difficult to specify with standard reward functions.

reinforcement-learning Reinforcement Learning (RL) +1

101

Paper
Code

JUMBO: Scalable Multi-task Bayesian Optimization using Offline Data

1 code implementation • 2 Jun 2021 • Kourosh Hakhamaneshi, Pieter Abbeel, Vladimir Stojanović, Aditya Grover

Such a decomposition can dynamically control the reliability of information derived from the online and offline data, and the use of pretrained neural networks permits scalability to large offline datasets.

Bayesian Optimization Gaussian Processes

Paper
Code

Decision Transformer: Reinforcement Learning via Sequence Modeling

16 code implementations • NeurIPS 2021 • Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.

Ranked #3 on Offline RL on D4RL

Atari Games D4RL +5

2,534

Paper
Code

VideoGPT: Video Generation using VQ-VAE and Transformers

3 code implementations • 20 Apr 2021 • Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas

We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.

Ranked #3 on Video Generation on UCF-101 16 frames, 128x128, Unconditional

Position Video Generation

878

Paper
Code

Auto-Tuned Sim-to-Real Transfer

1 code implementation • 15 Apr 2021 • Yuqing Du, Olivia Watkins, Trevor Darrell, Pieter Abbeel, Deepak Pathak

Policies trained in simulation often fail when transferred to the real world due to the "reality gap", where the simulator is unable to accurately capture the dynamics and visual properties of the real world.

Paper
Code

Learning What To Do by Simulating the Past

1 code implementation • ICLR 2021 • David Lindner, Rohin Shah, Pieter Abbeel, Anca Dragan

Since reward functions are hard to specify, recent work has focused on learning policies from human feedback.

Paper
Code

GEM: Group Enhanced Model for Learning Dynamical Control Systems

no code implementations • 7 Apr 2021 • Philippe Hansen-Estruch, Wenling Shang, Lerrel Pinto, Pieter Abbeel, Stas Tiomkin

In this work, we take advantage of these structures to build effective dynamical models that are amenable to sample-based learning.

Continuous Control Model-based Reinforcement Learning

Paper
Add Code

AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control

3 code implementations • 5 Apr 2021 • Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa

Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.

Imitation Learning Reinforcement Learning (RL)

1,612

Paper
Code

Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis

2 code implementations • ICCV 2021 • Ajay Jain, Matthew Tancik, Pieter Abbeel

We present DietNeRF, a 3D neural scene representation estimated from a few images.

Image Reconstruction Novel View Synthesis

263

Paper
Code

Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

no code implementations • 26 Mar 2021 • Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Developing robust walking controllers for bipedal robots is a challenging endeavor.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Mutual Information State Intrinsic Control

2 code implementations • ICLR 2021 • Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu

Reinforcement learning has been shown to be highly successful at many challenging tasks.

Paper
Code

Pretrained Transformers as Universal Computation Engines

4 code implementations • 9 Mar 2021 • Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning -- in particular, without finetuning of the self-attention and feedforward layers of the residual blocks.

239

Paper
Code

Behavior From the Void: Unsupervised Active Pre-Training

1 code implementation • NeurIPS 2021 • Hao Liu, Pieter Abbeel

We introduce a new unsupervised pre-training method for reinforcement learning called APT, which stands for Active Pre-Training.

Ranked #3 on Unsupervised Reinforcement Learning on URLB (states, 10^5 frames)

Atari Games Reinforcement Learning (RL) +2

321

Paper
Code

Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings

1 code implementation • NeurIPS 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel

Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations.

Ranked #33 on Atari Games on Atari 2600 Amidar

Atari Games Computational Efficiency +3

Paper
Code

Task-Agnostic Morphology Evolution

1 code implementation • ICLR 2021 • Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto

Deep reinforcement learning primarily focuses on learning behavior, usually overlooking the fact that an agent's function is largely determined by form.

Paper
Code

State Entropy Maximization with Random Encoders for Efficient Exploration

2 code implementations • ICLR Workshop SSL-RL 2021 • Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL).

Efficient Exploration Reinforcement Learning (RL)

31,055

Paper
Code

MSA Transformer

1 code implementation • 13 Feb 2021 • Roshan Rao, Jason Liu, Robert Verkuil, Joshua Meier, John F. Canny, Pieter Abbeel, Tom Sercu, Alexander Rives

Unsupervised protein language models trained across millions of diverse sequences learn structure and function of proteins.

Masked Language Modeling Multiple Sequence Alignment +1

1,140

Paper
Code

Bottleneck Transformers for Visual Recognition

13 code implementations • CVPR 2021 • Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84.7% top-1 accuracy on the ImageNet benchmark while being up to 1.64x faster in compute time than the popular EfficientNet models on TPU-v3 hardware.

Ranked #52 on Instance Segmentation on COCO minival

Image Classification Instance Segmentation +3

29,735

Paper
Code

Reinforcement Learning with Latent Flow

2 code implementations • NeurIPS 2021 • Wenling Shang, Xiaofei Wang, Aravind Srinivas, Aravind Rajeswaran, Yang Gao, Pieter Abbeel, Michael Laskin

Temporal information is essential to learning effective policies with Reinforcement Learning (RL).

Ranked #1 on Montezuma's Revenge on Atari 2600 Montezuma's Revenge

Continuous Control Montezuma's Revenge +4

Paper
Code

R-LAtte: Attention Module for Visual Control via Reinforcement Learning

no code implementations • 1 Jan 2021 • Mandi Zhao, Qiyang Li, Aravind Srinivas, Ignasi Clavera, Kimin Lee, Pieter Abbeel

Attention mechanisms are generic inductive biases that have played a critical role in improving the state-of-the-art in supervised learning, unsupervised pre-training and generative modeling for multiple domains including vision, language and speech.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Robust Imitation via Decision-Time Planning

no code implementations • 1 Jan 2021 • Carl Qi, Pieter Abbeel, Aditya Grover

The goal of imitation learning is to mimic expert behavior from demonstrations, without access to an explicit reward signal.

Imitation Learning reinforcement-learning +2

Paper
Add Code

VideoGen: Generative Modeling of Videos using VQ-VAE and Transformers

no code implementations • 1 Jan 2021 • Yunzhi Zhang, Wilson Yan, Pieter Abbeel, Aravind Srinivas

We present VideoGen: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.

Position Video Generation

Paper
Add Code

Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets

no code implementations • 1 Jan 2021 • SeungHyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin

As it turns out, fine-tuning offline RL agents is a non-trivial challenge, due to distribution shift – the agent encounters out-of-distribution samples during online interaction, which may cause bootstrapping error in Q-learning and instability during fine-tuning.

D4RL Offline RL +3

Paper
Add Code

Unsupervised Active Pre-Training for Reinforcement Learning

no code implementations • 1 Jan 2021 • Hao Liu, Pieter Abbeel

On the DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency and dramatically improves performance on tasks that are extremely difficult to train from scratch.

Atari Games Contrastive Learning +3

Paper
Add Code

Compute- and Memory-Efficient Reinforcement Learning with Latent Experience Replay

no code implementations • 1 Jan 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel

In this paper, we present Latent Vector Experience Replay (LeVER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements without sacrificing the performance of RL agents.

Atari Games reinforcement-learning +2

Paper
Add Code

Discrete Predictive Representation for Long-horizon Planning

no code implementations • 1 Jan 2021 • Thanard Kurutach, Julia Peng, Yang Gao, Stuart Russell, Pieter Abbeel

For decades, discrete representations have been key in enabling robots to plan at more abstract levels and solve temporally extended tasks more efficiently.

Object Reinforcement Learning (RL)

Paper
Add Code

Weighted Bellman Backups for Improved Signal-to-Noise in Q-Updates

no code implementations • 1 Jan 2021 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel

Furthermore, since our weighted Bellman backups rely on maintaining an ensemble, we investigate how weighted Bellman backups interact with other benefits previously derived from ensembles: (a) Bootstrap; (b) UCB Exploration.

Q-Learning Reinforcement Learning (RL)

Paper
Add Code

Benefits of Assistance over Reward Learning

no code implementations • 1 Jan 2021 • Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell

By merging reward learning and control, assistive agents can reason about the impact of control actions on reward learning, leading to several advantages over agents based on reward learning.

Paper
Add Code

Learning Visual Robotic Control Efficiently with Contrastive Pre-training and Data Augmentation

no code implementations • 14 Dec 2020 • Albert Zhan, Ruihan Zhao, Lerrel Pinto, Pieter Abbeel, Michael Laskin

We present Contrastive Pre-training and Data Augmentation for Efficient Robotic Learning (CoDER), a method that utilizes data augmentation and unsupervised learning to achieve sample-efficient training of real-robot arm policies from sparse rewards.

Data Augmentation reinforcement-learning +2

Paper
Add Code

Parallel Training of Deep Networks with Local Updates

1 code implementation • 7 Dec 2020 • Michael Laskin, Luke Metz, Seth Nabarro, Mark Saroufim, Badreddine Noune, Carlo Luschi, Jascha Sohl-Dickstein, Pieter Abbeel

Deep learning models trained on large data sets have been widely successful in both vision and language domains.

Paper
Code

Reset-Free Lifelong Learning with Skill-Space Planning

1 code implementation • ICLR 2021 • Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch

We propose Lifelong Skill Planning (LiSP), an algorithmic framework for non-episodic lifelong RL based on planning in an abstract space of higher-order skills.

Reinforcement Learning (RL)

Paper
Code

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

1 code implementation • NeurIPS 2020 • Younggyo Seo, Kimin Lee, Ignasi Clavera, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel

Model-based reinforcement learning (RL) has shown great potential in various control tasks in terms of both sample-efficiency and final performance.

Clustering Model-based Reinforcement Learning +4

Paper
Code

LaND: Learning to Navigate from Disengagements

1 code implementation • 9 Oct 2020 • Gregory Kahn, Pieter Abbeel, Sergey Levine

However, we believe that these disengagements not only show where the system fails, which is useful for troubleshooting, but also provide a direct learning signal by which the robot can learn to navigate.

Autonomous Navigation Imitation Learning +3

Paper
Code

Decoupling Representation Learning from Reinforcement Learning

3 code implementations • 14 Sep 2020 • Adam Stooke, Kimin Lee, Pieter Abbeel, Michael Laskin

In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning.

Data Augmentation reinforcement-learning +2

2,198

Paper
Code

Visual Imitation Made Easy

no code implementations • 11 Aug 2020 • Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto

We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.

Imitation Learning

Paper
Add Code

Robust Reinforcement Learning using Adversarial Populations

1 code implementation • 4 Aug 2020 • Eugene Vinitsky, Yuqing Du, Kanaad Parvate, Kathy Jang, Pieter Abbeel, Alexandre Bayen

Reinforcement Learning (RL) is an effective tool for controller design but can struggle with issues of robustness, failing catastrophically when the underlying system dynamics are perturbed.

Out-of-Distribution Generalization reinforcement-learning +1

Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning

no code implementations • 3 Aug 2020 • Xingyu Lu, Kimin Lee, Pieter Abbeel, Stas Tiomkin

Despite the significant progress of deep reinforcement learning (RL) in solving sequential decision making problems, RL agents often overfit to training environments and struggle to adapt to new, unseen environments.

Decision Making reinforcement-learning +1

Hybrid Discriminative-Generative Training via Contrastive Learning

1 code implementation • 17 Jul 2020 • Hao Liu, Pieter Abbeel

In this paper we show that through the perspective of hybrid discriminative-generative training of energy-based models we can make a direct connection between contrastive learning and supervised learning.

Contrastive Learning Out-of-Distribution Detection

Efficient Empowerment Estimation for Unsupervised Stabilization

no code implementations • ICLR 2021 • Ruihan Zhao, Kevin Lu, Pieter Abbeel, Stas Tiomkin

We demonstrate our solution for sample-based unsupervised stabilization on different dynamical control systems and show the advantages of our method by comparing it to the existing VLB approaches.

Variable Skipping for Autoregressive Range Density Estimation

1 code implementation • ICML 2020 • Eric Liang, Zongheng Yang, Ion Stoica, Pieter Abbeel, Yan Duan, Xi Chen

In this paper, we explore a technique, variable skipping, for accelerating range density estimation over deep autoregressive models.

Data Augmentation Density Estimation

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

1 code implementation • 9 Jul 2020 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel

Off-policy deep reinforcement learning (RL) has been successful in a range of challenging domains.

Efficient Exploration Ensemble Learning +3

116

Contrastive Code Representation Learning

1 code implementation • EMNLP 2021 • Paras Jain, Ajay Jain, Tianjun Zhang, Pieter Abbeel, Joseph E. Gonzalez, Ion Stoica

Recent work learns contextual representations of source code by reconstructing tokens from their context.

Ranked #1 on Method name prediction on CodeSearchNet

Clone Detection Contrastive Learning +4

165

Self-Supervised Policy Adaptation during Deployment

2 code implementations • ICLR 2021 • Nicklas Hansen, Rishabh Jangir, Yu Sun, Guillem Alenyà, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang

A natural solution would be to keep training after deployment in the new environment, but this cannot be done if the new environment offers no reward signal.

109

Responsive Safety in Reinforcement Learning by PID Lagrangian Methods

no code implementations • 8 Jul 2020 • Adam Stooke, Joshua Achiam, Pieter Abbeel

Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when applied to safe reinforcement learning, lead to constraint-violating behavior during agent training.

reinforcement-learning Reinforcement Learning (RL) +1

AvE: Assistance via Empowerment

1 code implementation • NeurIPS 2020 • Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

One difficulty in using artificial agents for human-assistive applications lies in the challenge of accurately assisting with a person's goal(s).

Locally Masked Convolution for Autoregressive Models

1 code implementation • 22 Jun 2020 • Ajay Jain, Pieter Abbeel, Deepak Pathak

For tasks such as image completion, these models are unable to use much of the observed context.

Ranked #1 on Image Generation on MNIST

Anomaly Detection Density Estimation +2

Denoising Diffusion Probabilistic Models

61 code implementations • NeurIPS 2020 • Jonathan Ho, Ajay Jain, Pieter Abbeel

We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics.

Ranked #2 on Image Generation on LSUN Bedroom

Denoising Density Estimation +1

47,906

Automatic Curriculum Learning through Value Disagreement

1 code implementation • NeurIPS 2020 • Yunzhi Zhang, Pieter Abbeel, Lerrel Pinto

Our key insight is that if we can sample goals at the frontier of the set of goals that an agent is able to reach, it will provide a significantly stronger learning signal compared to randomly sampled goals.

Reinforcement Learning (RL)

Mutual Information Maximization for Robust Plannable Representations

no code implementations • 16 May 2020 • Yiming Ding, Ignasi Clavera, Pieter Abbeel

The latter, while presenting low sample complexity, learn latent spaces that need to reconstruct every single detail of the scene.

Model-based Reinforcement Learning

Model-Augmented Actor-Critic: Backpropagating through Paths

no code implementations • ICLR 2020 • Ignasi Clavera, Violet Fu, Pieter Abbeel

Current model-based reinforcement learning approaches use the model simply as a learned black-box simulator to augment the data for policy optimization or value function learning.

Model-based Reinforcement Learning

Planning to Explore via Self-Supervised World Models

4 code implementations • 12 May 2020 • Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak

Reinforcement learning allows solving complex tasks; however, the learning tends to be task-specific and the sample efficiency remains a challenge.

reinforcement-learning Reinforcement Learning (RL)

208

Plan2Vec: Unsupervised Representation Learning by Latent Plans

1 code implementation • 7 May 2020 • Ge Yang, Amy Zhang, Ari S. Morcos, Joelle Pineau, Pieter Abbeel, Roberto Calandra

In this paper we introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning.

Motion Planning reinforcement-learning +2

Reinforcement Learning with Augmented Data

2 code implementations • NeurIPS 2020 • Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas

To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.

Data Augmentation OpenAI Gym +2

397

CURL: Contrastive Unsupervised Representations for Reinforcement Learning

7 code implementations • 8 Apr 2020 • Aravind Srinivas, Michael Laskin, Pieter Abbeel

On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample-efficiency of methods that use state-based features.

Ranked #1 on Continuous Control on Finger, spin (DMControl500k)

Atari Games Atari Games 100k +4

2,534

Sparse Graphical Memory for Robust Planning

1 code implementation • NeurIPS 2020 • Scott Emmons, Ajay Jain, Michael Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak

To operate effectively in the real world, agents should be able to act from high-dimensional raw sensory input such as images and achieve diverse goals across long time-horizons.

Imitation Learning Visual Navigation

Learning Predictive Representations for Deformable Objects Using Contrastive Estimation

1 code implementation • 11 Mar 2020 • Wilson Yan, Ashwin Vangipuram, Pieter Abbeel, Lerrel Pinto

Using visual model-based learning for deformable object manipulation is challenging due to difficulties in learning plannable visual representations along with complex dynamic models.

Deformable Object Manipulation

Hierarchically Decoupled Imitation for Morphological Transfer

1 code implementation • 3 Mar 2020 • Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto

Learning long-range behaviors on complex high-dimensional agents is a fundamental problem in robot learning.

Hallucinative Topological Memory for Zero-Shot Visual Planning

1 code implementation • ICML 2020 • Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar

In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction.

Generalized Hindsight for Reinforcement Learning

no code implementations • NeurIPS 2020 • Alexander C. Li, Lerrel Pinto, Pieter Abbeel

Compared to standard relabeling techniques, Generalized Hindsight provides a substantially more efficient reuse of samples, which we empirically demonstrate on a suite of multi-task navigation and manipulation tasks.

reinforcement-learning Reinforcement Learning (RL)

GACEM: Generalized Autoregressive Cross Entropy Method for Multi-Modal Black Box Constraint Satisfaction

no code implementations • 17 Feb 2020 • Kourosh Hakhamaneshi, Keertana Settaluri, Pieter Abbeel, Vladimir Stojanovic

In this work we present a new method of black-box optimization and constraint satisfaction.

Policy Gradient Methods

BADGR: An Autonomous Self-Supervised Learning-Based Navigation System

1 code implementation • 13 Feb 2020 • Gregory Kahn, Pieter Abbeel, Sergey Levine

Mobile robot navigation is typically regarded as a geometric problem, in which the robot's objective is to perceive the geometry of the environment in order to plan collision-free paths towards a desired goal.

Navigate Robot Navigation +1

140

Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning

no code implementations • 5 Feb 2020 • Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu

In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.

reinforcement-learning Reinforcement Learning (RL)

Preventing Imitation Learning with Adversarial Policy Ensembles

no code implementations • 31 Jan 2020 • Albert Zhan, Stas Tiomkin, Pieter Abbeel

To our knowledge, this is the first work regarding the protection of policies in Reinforcement Learning.

Imitation Learning reinforcement-learning +1

Hierarchical Variational Imitation Learning of Control Programs

1 code implementation • 29 Dec 2019 • Roy Fox, Richard Shin, William Paul, Yitian Zou, Dawn Song, Ken Goldberg, Pieter Abbeel, Ion Stoica

Autonomous agents can learn by imitating teacher demonstrations of the intended behavior.

Imitation Learning Variational Inference

Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards

no code implementations • 21 Dec 2019 • Xingyu Lu, Stas Tiomkin, Pieter Abbeel

While recent progress in deep reinforcement learning has enabled robots to learn complex behaviors, tasks with long horizons and sparse rewards remain an ongoing challenge.

reinforcement-learning Reinforcement Learning (RL)

AVID: Learning Multi-Stage Tasks via Pixel-Level Translation of Human Videos

no code implementations • 10 Dec 2019 • Laura Smith, Nikita Dhawan, Marvin Zhang, Pieter Abbeel, Sergey Levine

In this paper, we study how these challenges can be alleviated with an automated robotic learning framework, in which multi-stage tasks are defined simply by providing videos of a human demonstrator and then learned autonomously by the robot from raw image observations.

Reinforcement Learning (RL) Translation

Learning Efficient Representation for Intrinsic Motivation

no code implementations • 4 Dec 2019 • Ruihan Zhao, Stas Tiomkin, Pieter Abbeel

The core idea is to represent the relation between action sequences and future states using a stochastic dynamic model in latent space with a specific form.

Adaptive Online Planning for Continual Lifelong Learning

1 code implementation • 3 Dec 2019 • Kevin Lu, Igor Mordatch, Pieter Abbeel

We study learning control in an online reset-free lifelong learning scenario, where mistakes can compound catastrophically into the future and the underlying dynamics of the environment may change.

Compositional Plan Vectors

1 code implementation • NeurIPS 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

We show that CPVs can be learned within a one-shot imitation learning framework without any additional supervision or information about task hierarchy, and enable a demonstration-conditioned policy to generalize to tasks that sequence twice as many skills as the tasks seen during training.

Imitation Learning

Natural Image Manipulation for Autoregressive Models Using Fisher Scores

no code implementations • 25 Nov 2019 • Wilson Yan, Jonathan Ho, Pieter Abbeel

Deep autoregressive models are among the most powerful models that exist today and achieve state-of-the-art bits per dim.

Image Manipulation

Plan Arithmetic: Compositional Plan Vectors for Multi-Task Control

no code implementations • 30 Oct 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

Imitation Learning

Learning to Manipulate Deformable Objects without Demonstrations

2 code implementations • 29 Oct 2019 • Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, Pieter Abbeel

Second, instead of jointly learning both the pick and the place locations, we only explicitly learn the placing policy conditioned on random pick points.

Deformable Object Manipulation Object +1

Geometry-Aware Neural Rendering

1 code implementation • NeurIPS 2019 • Josh Tobin, OpenAI Robotics, Pieter Abbeel

Understanding the 3-dimensional structure of the world is a core challenge in computer vision and robotics.

Neural Rendering
