Search Results for author: Sergey Levine

Found 491 papers, 228 papers with code

Time-Contrastive Networks: Self-Supervised Learning from Video

7 code implementations 23 Apr 2017 Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine

While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.
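
The core of the approach is a time-contrastive triplet objective: embeddings of simultaneous frames from different viewpoints are pulled together, while temporally distant frames from the same view are pushed apart. A minimal sketch follows, assuming precomputed embeddings; the margin value and names are illustrative, not taken from the authors' release.

```python
# A sketch of the time-contrastive triplet loss: the anchor and positive are
# simultaneous frames from different viewpoints; the negative is a temporally
# distant frame from the anchor's own view. The margin is an illustrative value.
import torch.nn.functional as F

def tcn_triplet_loss(anchor_emb, positive_emb, negative_emb, margin=0.2):
    d_pos = (anchor_emb - positive_emb).pow(2).sum(dim=-1)
    d_neg = (anchor_emb - negative_emb).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()
```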

Metric Learning reinforcement-learning +3

Data-Efficient Hierarchical Reinforcement Learning

12 code implementations NeurIPS 2018 Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine

In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.

Hierarchical Reinforcement Learning reinforcement-learning +1

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

2 code implementations 16 Nov 2015 Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

Unsupervised Learning for Physical Interaction through Video Prediction

3 code implementations NeurIPS 2016 Chelsea Finn, Ian Goodfellow, Sergey Levine

A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment.

Object Video Generation +1

High-Dimensional Continuous Control Using Generalized Advantage Estimation

17 code implementations 8 Jun 2015 John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.
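
The paper's estimator exponentially averages k-step advantage estimates. Below is a minimal sketch of the resulting recursion, assuming a single rollout with no episode boundaries; the function and argument names are illustrative, not from the authors' released code.

```python
# A minimal sketch of the GAE recursion under the assumptions above.
import numpy as np

def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """A_t = sum_l (gamma * lam)^l * delta_{t+l}, with
    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    values = np.append(values, last_value)        # bootstrap value for the final state
    advantages = np.zeros(len(rewards))
    gae = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae           # exponentially weighted sum of deltas
        advantages[t] = gae
    return advantages
```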

Continuous Control Policy Gradient Methods +1

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

1 code implementation NeurIPS 2019 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

We introduce a general control algorithm that combines the strengths of planning and reinforcement learning to effectively solve these tasks.

reinforcement-learning Reinforcement Learning (RL)

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

1 code implementation ICLR 2022 Mengjiao Yang, Sergey Levine, Ofir Nachum

In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.

Imitation Learning

Meta-Learning without Memorization

1 code implementation ICLR 2020 Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes.

Few-Shot Image Classification Memorization +1

Data-Driven Offline Optimization For Architecting Hardware Accelerators

1 code implementation ICLR 2022 Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine

An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any form of simulation.

Computer Architecture and Systems

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

82 code implementations ICML 2017 Chelsea Finn, Pieter Abbeel, Sergey Levine

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
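
The algorithm differentiates through one or more inner gradient steps so that the initial parameters become easy to fine-tune. A minimal single-task sketch in PyTorch, assuming a functional model `model_fn(params, x)` and a differentiable `loss_fn`; these names are illustrative assumptions, not the authors' API.

```python
# One MAML step for a single task, under the assumptions above.
import torch

def maml_outer_loss(params, support, query, model_fn, loss_fn, inner_lr=0.01):
    xs, ys = support
    # Inner loop: one gradient step on the support set.
    inner_loss = loss_fn(model_fn(params, xs), ys)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]
    # Outer objective: query loss under adapted parameters. Backpropagating
    # through `adapted` differentiates through the inner update itself.
    xq, yq = query
    return loss_fn(model_fn(adapted, xq), yq)
```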

Few-Shot Image Classification General Classification +4

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

3 code implementations 4 May 2020 Sergey Levine, Aviral Kumar, George Tucker, Justin Fu

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.

Decision Making reinforcement-learning +1

Conservative Q-Learning for Offline Reinforcement Learning

17 code implementations NeurIPS 2020 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
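
Concretely, the lower bound comes from a regularizer that pushes down Q-values on out-of-distribution actions while pushing them up on dataset actions. A sketch of the discrete-action CQL(H) variant follows; the network interfaces, batch layout, and `alpha` are illustrative assumptions.

```python
# A sketch of the discrete-action CQL(H) loss: standard Bellman error plus
# a conservative term (logsumexp of Q over actions minus Q at dataset actions).
import torch
import torch.nn.functional as F

def cql_loss(q_net, target_q_net, batch, gamma=0.99, alpha=1.0):
    s, a, r, s_next, done = batch                      # a: LongTensor of action indices
    q_all = q_net(s)                                   # (batch, num_actions)
    q_data = q_all.gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * target_q_net(s_next).max(dim=1).values
    bellman = F.mse_loss(q_data, target)
    conservative = (torch.logsumexp(q_all, dim=1) - q_data).mean()
    return bellman + alpha * conservative
```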

Continuous Control DQN Replay Dataset +3

Model-Based Reinforcement Learning for Atari

2 code implementations 1 Mar 2019 Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Atari Games 100k +4

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

1 code implementation ICLR 2020 Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.

Predict Future Video Frames Video Generation

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

5 code implementations 1 Oct 2019 Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine

In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines.
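
The policy update reduces to regression onto dataset actions weighted by exponentiated advantages, exp(A / beta). A minimal sketch, assuming a `policy.log_prob(states, actions)` interface; the weight cap is a stabilization detail and the names are illustrative.

```python
# A sketch of the AWR policy loss under the assumptions above.
import torch

def awr_policy_loss(policy, states, actions, advantages, beta=1.0, max_weight=20.0):
    weights = torch.exp(advantages / beta).clamp(max=max_weight)
    log_probs = policy.log_prob(states, actions)
    return -(weights * log_probs).mean()
```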

Continuous Control OpenAI Gym +3

Trust Region Policy Optimization

21 code implementations 19 Feb 2015 John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.
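
For reference, the surrogate problem solved at each iteration can be written as the following KL-constrained optimization (notation paraphrased from the paper):

```latex
\max_{\theta}\;
\mathbb{E}\!\left[
  \frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)}\,
  A_{\theta_{\mathrm{old}}}(s, a)
\right]
\quad \text{subject to} \quad
\mathbb{E}\!\left[
  D_{\mathrm{KL}}\!\left(
    \pi_{\theta_{\mathrm{old}}}(\cdot \mid s)\,\big\|\,\pi_\theta(\cdot \mid s)
  \right)
\right] \le \delta
```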

Atari Games Policy Gradient Methods

Reinforcement Learning with Deep Energy-Based Policies

3 code implementations ICML 2017 Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which was previously feasible only in tabular domains.

Q-Learning reinforcement-learning +1

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

4 code implementations 1 Mar 2016 Chelsea Finn, Sergey Levine, Pieter Abbeel

We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems.

Feature Engineering

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

5 code implementations ICLR 2020 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.
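
The algorithm itself is a one-line modification to off-policy RL: demonstrations enter the replay buffer with reward +1 and the agent's own transitions with reward 0. A sketch under an assumed replay-buffer interface:

```python
# SQIL's core recipe: run ordinary off-policy RL (e.g., soft Q-learning) but
# replace rewards with constants. The buffer interface below is an assumption.
def fill_replay_buffer(buffer, demo_transitions, agent_transitions):
    for (s, a, s_next) in demo_transitions:
        buffer.add(s, a, reward=1.0, next_state=s_next)   # expert data gets r = +1
    for (s, a, s_next) in agent_transitions:
        buffer.add(s, a, reward=0.0, next_state=s_next)   # agent data gets r = 0
    # Q-learning then proceeds unmodified on `buffer`.
```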

Imitation Learning Q-Learning +2

When to Trust Your Model: Model-Based Policy Optimization

11 code implementations NeurIPS 2019 Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.

Model-based Reinforcement Learning reinforcement-learning +1

Offline Reinforcement Learning with Implicit Q-Learning

15 code implementations 12 Oct 2021 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
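
The upper expectile is fit with an asymmetric squared loss. A minimal PyTorch sketch, assuming `q_values` come from a target critic at dataset actions and `v` from the state-value network; tau = 0.7 is a typical setting from the paper, and the names are illustrative.

```python
# The asymmetric squared loss used to fit an upper expectile of Q.
import torch

def expectile_loss(q_values, v, tau=0.7):
    diff = q_values - v
    weight = torch.abs(tau - (diff < 0).float())   # tau if diff > 0, else 1 - tau
    return (weight * diff.pow(2)).mean()
```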

D4RL Offline RL +3

Planning with Diffusion for Flexible Behavior Synthesis

2 code implementations 20 May 2022 Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.

Decision Making Denoising +2

Visual Reinforcement Learning with Imagined Goals

2 code implementations NeurIPS 2018 Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine

For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.

reinforcement-learning Reinforcement Learning (RL) +1

AWAC: Accelerating Online Reinforcement Learning with Offline Datasets

6 code implementations 16 Jun 2020 Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine

If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.

reinforcement-learning Reinforcement Learning (RL)

Robust Predictable Control

1 code implementation NeurIPS 2021 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression.

Computational Efficiency Decision Making +1

DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

6 code implementations 8 Apr 2018 Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne

We further explore a number of methods for integrating multiple clips into the learning process to develop multi-skilled agents capable of performing a rich repertoire of diverse skills.

Motion Synthesis reinforcement-learning +1

The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget

1 code implementation ICLR 2020 Anirudh Goyal, Yoshua Bengio, Matthew Botvinick, Sergey Levine

This is typically the case when we have a standard conditioning input, such as a state observation, and a "privileged" input, which might correspond to the goal of a task, the output of a costly planning algorithm, or communication with another agent.

reinforcement-learning Reinforcement Learning (RL) +1

AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control

3 code implementations 5 Apr 2021 Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa

Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.

Imitation Learning Reinforcement Learning (RL)

Adaptive Risk Minimization: Learning to Adapt to Domain Shift

3 code implementations NeurIPS 2021 Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.

BIG-bench Machine Learning Domain Generalization +2

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

3 code implementations NeurIPS 2019 Aviral Kumar, Justin Fu, George Tucker, Sergey Levine

Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.

Continuous Control Q-Learning

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

7 code implementations 15 Apr 2020 Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine

In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.

D4RL Offline RL +2

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

8 code implementations 24 Oct 2019 Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine

Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors.

Meta-Learning Meta Reinforcement Learning +3

The False Promise of Imitating Proprietary LLMs

1 code implementation 25 May 2023 Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song

This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.

Language Modelling

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

10 code implementations NeurIPS 2018 Kurtland Chua, Roberto Calandra, Rowan Mcallister, Sergey Levine

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance.

Model-based Reinforcement Learning reinforcement-learning +1

Deep Dynamics Models for Learning Dexterous Manipulation

2 code implementations 25 Sep 2019 Anusha Nagabandi, Kurt Konolige, Sergey Levine, Vikash Kumar

Dexterous multi-fingered hands can provide robots with the ability to flexibly perform a wide range of manipulation skills.

Model Predictive Control

Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review

2 code implementations 2 May 2018 Sergey Levine

The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable.

Decision Making reinforcement-learning +2

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

5 code implementations ICLR 2019 Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine

By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.

Continuous Control Image Generation +1

Guided Policy Search as Approximate Mirror Descent

1 code implementation 15 Jul 2016 William Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.

Value Iteration Networks

8 code implementations NeurIPS 2016 Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel

We introduce the value iteration network (VIN): a fully differentiable neural network with a `planning module' embedded within.

reinforcement-learning Reinforcement Learning (RL)

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

1 code implementation 16 Nov 2018 Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin.
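
The arithmetic relationship is that the scene embedding before a grasp, minus the embedding after, should match the embedding of the object that was removed. A retrieval-scoring sketch, assuming encoder callables `phi_scene` and `phi_object` (illustrative names, not the paper's code):

```python
# The embedding arithmetic behind Grasp2Vec: (scene before grasp) minus
# (scene after grasp) should align with the grasped object's embedding.
import torch.nn.functional as F

def grasp_consistency_score(phi_scene, phi_object, pre_img, post_img, object_img):
    removed = phi_scene(pre_img) - phi_scene(post_img)   # what left the scene
    obj = phi_object(object_img)
    return F.cosine_similarity(removed, obj, dim=-1)     # high if this object was grasped
```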

Object Representation Learning

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

1 code implementation ICLR 2021 Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning as One Big Sequence Modeling Problem

2 code implementations NeurIPS 2021 Michael Janner, Qiyang Li, Sergey Levine

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.

Imitation Learning Offline RL +2

Reinforcement Learning as One Big Sequence Modeling Problem

1 code implementation ICML Workshop URL 2021 Michael Janner, Qiyang Li, Sergey Levine

However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.

Imitation Learning Offline RL +2

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation 19 Mar 2018 Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.

Q-Learning reinforcement-learning +1

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

7 code implementations 30 Oct 2017 Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Decision Making reinforcement-learning +1

End-to-End Robotic Reinforcement Learning without Reward Engineering

3 code implementations 16 Apr 2019 Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine

In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.

reinforcement-learning Reinforcement Learning (RL)

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

8 code implementations 8 Aug 2017 Anusha Nagabandi, Gregory Kahn, Ronald S. Fearing, Sergey Levine

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.

Model-based Reinforcement Learning Model Predictive Control +2

Training Diffusion Models with Reinforcement Learning

2 code implementations 22 May 2023 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.

Decision Making Denoising +2

OmniTact: A Multi-Directional High Resolution Touch Sensor

1 code implementation 16 Mar 2020 Akhil Padmanabha, Frederik Ebert, Stephen Tian, Roberto Calandra, Chelsea Finn, Sergey Levine

We compare with a state-of-the-art tactile sensor that is only sensitive on one side, as well as a state-of-the-art multi-directional tactile sensor, and find that OmniTact's combination of high-resolution and multi-directional sensing is crucial for reliably inserting the electrical connector and allows for higher accuracy in the state estimation task.

SFV: Reinforcement Learning of Physical Skills from Videos

1 code implementation 8 Oct 2018 Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine

In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV).

Pose Estimation reinforcement-learning +1

Stochastic Adversarial Video Prediction

4 code implementations ICLR 2019 Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.

Ranked #1 on Video Prediction on KTH (Cond metric)

Representation Learning Video Generation +1

Gradient Surgery for Multi-Task Learning

9 code implementations NeurIPS 2020 Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.

Image Classification Multi-Task Learning +1

One-Shot Visual Imitation Learning via Meta-Learning

3 code implementations 14 Sep 2017 Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine

In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration.

Imitation Learning Meta-Learning

Adversarial Policies: Attacking Deep Reinforcement Learning

2 code implementations ICLR 2020 Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell

Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.

reinforcement-learning Reinforcement Learning (RL)

Inter-Level Cooperation in Hierarchical Reinforcement Learning

1 code implementation 5 Dec 2019 Abdul Rahman Kreidieh, Glen Berseth, Brandon Trabucco, Samyak Parajuli, Sergey Levine, Alexandre M. Bayen

This allows us to draw on connections between communication and cooperation in multi-agent RL, and demonstrate the benefits of increased cooperation between sub-policies on the training performance of the overall policy.

Hierarchical Reinforcement Learning reinforcement-learning +1

A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models

3 code implementations 11 Nov 2016 Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine

In particular, we demonstrate an equivalence between a sample-based algorithm for maximum entropy IRL and a GAN in which the generator's density can be evaluated and is provided as an additional input to the discriminator.
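
In that construction, the discriminator takes the special form below, where c_theta is the learned cost, Z its (estimated) partition function, and q the generator's density (notation paraphrased from the paper):

```latex
D_\theta(\tau) \;=\;
\frac{\tfrac{1}{Z}\exp\!\big(-c_\theta(\tau)\big)}
     {\tfrac{1}{Z}\exp\!\big(-c_\theta(\tau)\big) + q(\tau)}
```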

Imitation Learning reinforcement-learning +1

Probabilistic Model-Agnostic Meta-Learning

1 code implementation NeurIPS 2018 Chelsea Finn, Kelvin Xu, Sergey Levine

However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e.g., a classifier) for that task that is accurate.

Active Learning Few-Shot Image Classification +1

A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning

1 code implementation 16 Aug 2022 Laura Smith, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.

reinforcement-learning Reinforcement Learning (RL)

GNM: A General Navigation Model to Drive Any Robot

1 code implementation 7 Oct 2022 Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data.

COMBO: Conservative Offline Model-Based Policy Optimization

4 code implementations NeurIPS 2021 Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.

Offline RL Uncertainty Quantification

Learning from the Hindsight Plan -- Episodic MPC Improvement

1 code implementation 28 Sep 2016 Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function with the goal of satisfying the following property: short horizon planning (as realistic during real executions) with respect to the shaped cost should result in mimicking the hindsight plan.

Model Predictive Control

GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots

1 code implementation 12 Sep 2022 Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine

In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots.

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning

2 code implementations ICLR 2019 Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn

Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.

Continuous Control Meta-Learning +5

Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

2 code implementations ICML 2020 Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, Yarin Gal

Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment, typically leading to arbitrary deductions and poorly-informed decisions.

Autonomous Vehicles Out of Distribution (OOD) Detection

Dynamics-Aware Unsupervised Discovery of Skills

3 code implementations 2 Jul 2019 Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.

Model-based Reinforcement Learning

Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning

2 code implementations 27 Apr 2020 Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu

Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?

Model Predictive Control reinforcement-learning +2

Dynamics-Aware Unsupervised Skill Discovery

1 code implementation ICLR 2020 Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.

Model-based Reinforcement Learning

Self-Supervised Visual Planning with Temporal Skip Connections

3 code implementations 15 Oct 2017 Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine

One learning signal that is always available for autonomously collected data is prediction: if a robot can learn to predict the future, it can use this predictive model to take actions to produce desired outcomes, such as moving an object to a particular location.

Video Prediction

LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action

1 code implementation 10 Jul 2022 Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine

Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings.

Instruction Following Language Modelling

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

2 code implementations 7 Nov 2016 Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine

We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.

Continuous Control Policy Gradient Methods +2

Few-Shot Segmentation Propagation with Guided Networks

1 code implementation 25 May 2018 Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.

Interactive Segmentation Segmentation +3

BADGR: An Autonomous Self-Supervised Learning-Based Navigation System

1 code implementation 13 Feb 2020 Gregory Kahn, Pieter Abbeel, Sergey Levine

Mobile robot navigation is typically regarded as a geometric problem, in which the robot's objective is to perceive the geometry of the environment in order to plan collision-free paths towards a desired goal.

Navigate Robot Navigation +1

Learning Invariant Representations for Reinforcement Learning without Reconstruction

2 code implementations 18 Jun 2020 Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey Levine

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.

Causal Inference reinforcement-learning +2

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

1 code implementation 3 Dec 2018 Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains.

reinforcement-learning Reinforcement Learning (RL)

Meta-Learning with Implicit Gradients

6 code implementations NeurIPS 2019 Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine

By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.

Few-Shot Image Classification Few-Shot Learning

Recurrent Independent Mechanisms

3 code implementations ICLR 2021 Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf

Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes.

Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning

1 code implementation8 Dec 2020 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan

In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation

2 code implementations 29 Sep 2017 Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine

To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based.

Navigate Q-Learning +3

Multimodal Masked Autoencoders Learn Transferable Representations

1 code implementation 27 May 2022 Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel

We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.

Contrastive Learning

PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings

2 code implementations ICCV 2019 Nicholas Rhinehart, Rowan McAllister, Kris Kitani, Sergey Levine

For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.

Autonomous Vehicles

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

3 code implementations NeurIPS 2020 Aviral Kumar, Abhishek Gupta, Sergey Levine

We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.

Meta-Learning Multi-Task Learning +3

Continuous Deep Q-Learning with Model-based Acceleration

8 code implementations 2 Mar 2016 Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.

Continuous Control Q-Learning +2

Learning Human Objectives by Evaluating Hypothetical Behavior

1 code implementation ICML 2020 Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike

To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.

Car Racing

Benchmarks for Deep Off-Policy Evaluation

3 code implementations ICLR 2021 Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.

Benchmarking Continuous Control +3

Reinforcement Learning from Passive Data via Latent Intentions

1 code implementation 10 Apr 2023 Dibya Ghosh, Chethan Bhateja, Sergey Levine

Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.

reinforcement-learning Value prediction

Learning Instance Segmentation by Interaction

1 code implementation 21 Jun 2018 Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik

The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.

Instance Segmentation Segmentation +1

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization

3 code implementations 17 Feb 2022 Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine

To address this, we present Design-Bench, a benchmark for offline MBO with a unified evaluation protocol and reference implementations of recent methods.

Adversarial Policies Beat Superhuman Go AIs

2 code implementations 1 Nov 2022 Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack.

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation 20 Apr 2023 Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.

Offline RL Q-Learning

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations ICLR 2021 Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning Reinforcement Learning (RL)

Shared Autonomy via Deep Reinforcement Learning

1 code implementation 6 Feb 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.

reinforcement-learning Reinforcement Learning (RL)

SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning

1 code implementation ICLR 2019 Marvin Zhang, Sharad Vikram, Laura Smith, Pieter Abbeel, Matthew J. Johnson, Sergey Levine

Model-based reinforcement learning (RL) has proven to be a data efficient approach for learning control tasks but is difficult to utilize in domains with complex observations such as images.

Model-based Reinforcement Learning reinforcement-learning +1

RvS: What is Essential for Offline RL via Supervised Learning?

1 code implementation 20 Dec 2021 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.

Offline RL

Efficient Exploration via State Marginal Matching

1 code implementation 12 Jun 2019 Lisa Lee, Benjamin Eysenbach, Emilio Parisotto, Eric Xing, Sergey Levine, Ruslan Salakhutdinov

The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.
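
Minimizing the KL divergence between the policy's state marginal rho_pi and a target distribution p* yields an intrinsic reward of the form below, which the two players alternately optimize via the policy and the density model (a paraphrase of the objective, not the paper's exact notation):

```latex
r_\pi(s) \;=\; \log p^{*}(s) \;-\; \log \rho_\pi(s)
```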

Efficient Exploration Unsupervised Reinforcement Learning

FitVid: Overfitting in Pixel-Level Video Prediction

1 code implementation 24 Jun 2021 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan

There is a growing body of evidence that underfitting on the training data is one of the primary causes for the low quality predictions.

Image Augmentation Video Generation +1

MELD: Meta-Reinforcement Learning from Images via Latent State Models

1 code implementation 26 Oct 2020 Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine

Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.

Meta-Learning Meta Reinforcement Learning +3

Divide-and-Conquer Reinforcement Learning

1 code implementation ICLR 2018 Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine

In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.

Policy Gradient Methods reinforcement-learning +1

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

2 code implementations NeurIPS 2023 Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine

Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.
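
Relative to CQL, the calibration amounts to clipping the Q-values being pushed down so they never fall below a reference value, such as a Monte Carlo return estimate of a behavior policy. A sketch of that term; the tensor names are illustrative assumptions, not the authors' code.

```python
# A sketch of Cal-QL's change to the CQL regularizer: Q-values at policy
# actions are clipped from below at a reference value before being pushed down.
import torch

def calql_conservative_term(q_policy_actions, q_data_actions, reference_values):
    q_clipped = torch.maximum(q_policy_actions, reference_values)  # the calibration
    return (q_clipped - q_data_actions).mean()
```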

Offline RL Q-Learning +1

HIQL: Offline Goal-Conditioned RL with Latent States as Actions

1 code implementation NeurIPS 2023 Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine

This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.

Reinforcement Learning (RL) Unsupervised Pre-training

Extending the WILDS Benchmark for Unsupervised Adaptation

1 code implementation ICLR 2022 Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

2 code implementations NAACL 2022 Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.

Chatbot Offline RL +2

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

1 code implementation 29 Feb 2024 Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar

In this paper, we develop a framework for building multi-turn RL algorithms for fine-tuning LLMs, that preserves the flexibility of existing single-turn RL methods for LLMs (e.g., proximal policy optimization), while accommodating multiple turns, long horizons, and delayed rewards effectively.

Language Modelling Reinforcement Learning (RL)

Entity Abstraction in Visual Model-Based Reinforcement Learning

1 code implementation 28 Oct 2019 Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine

This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before.

Model-based Reinforcement Learning Object +5

Learning to Walk in the Real World with Minimal Human Effort

1 code implementation 20 Feb 2020 Sehoon Ha, Peng Xu, Zhenyu Tan, Sergey Levine, Jie Tan

In this paper, we develop a system for learning legged locomotion policies with deep RL in the real world with minimal human effort.

Multi-Task Learning

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

1 code implementation 30 Nov 2023 Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.

reinforcement-learning Text Generation

Foundation Policies with Hilbert Representations

1 code implementation 23 Feb 2024 Seohong Park, Tobias Kreiman, Sergey Levine

While a number of methods have been proposed to enable generic self-supervised RL, based on principles such as goal-conditioned RL, behavioral cloning, and unsupervised skill learning, such methods remain limited in terms of either the diversity of the discovered behaviors, the need for high-quality demonstration data, or the lack of a clear prompting or adaptation mechanism for downstream tasks.

Reinforcement Learning (RL) Unsupervised Pre-training

Learning with Latent Language

1 code implementation NAACL 2018 Jacob Andreas, Dan Klein, Sergey Levine

The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world.

Image Classification Navigate

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

1 code implementation 11 Jul 2017 YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.

Imitation Learning Translation +1

Conservative Objective Models for Effective Offline Model-Based Optimization

2 code implementations 14 Jul 2021 Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

Computational design problems arise in a number of settings, from synthetic biology to computer architectures.

Planning with Goal-Conditioned Policies

1 code implementation NeurIPS 2019 Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine

Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors.

Decision Making reinforcement-learning +3

MEMO: Test Time Robustness via Adaptation and Augmentation

2 code implementations 18 Oct 2021 Marvin Zhang, Sergey Levine, Chelsea Finn

We study the problem of test time robustification, i.e., using the test input to improve model robustness.

Test-time Adaptation

Evolving Reinforcement Learning Algorithms

5 code implementations ICLR 2021 John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.

Atari Games Meta-Learning +2

Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

1 code implementation 10 Jul 2017 Rouhollah Rahmatizadeh, Pooya Abolghasemi, Ladislau Bölöni, Sergey Levine

We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation.

Multi-Task Learning Position

Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors

1 code implementation NeurIPS 2020 Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine

In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations.

Offline Meta-Reinforcement Learning with Advantage Weighting

2 code implementations 13 Aug 2020 Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks in order to adapt to a new task with a very small amount (less than 5 trajectories) of data from the new task.

Machine Translation Meta-Learning +5

Causal Confusion in Imitation Learning

2 code implementations NeurIPS 2019 Pim de Haan, Dinesh Jayaraman, Sergey Levine

Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment.

Imitation Learning

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

1 code implementation 27 Feb 2024 Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine

Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner?

Offline RL reinforcement-learning

LaND: Learning to Navigate from Disengagements

1 code implementation 9 Oct 2020 Gregory Kahn, Pieter Abbeel, Sergey Levine

However, we believe that these disengagements not only show where the system fails, which is useful for troubleshooting, but also provide a direct learning signal by which the robot can learn to navigate.

Autonomous Navigation Imitation Learning +3

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

1 code implementation 13 Oct 2023 Seohong Park, Oleh Rybkin, Sergey Levine

Through our experiments in five locomotion and manipulation environments, we demonstrate that METRA can discover a variety of useful behaviors even in complex, pixel-based environments, being the first unsupervised RL method that discovers diverse locomotion behaviors in pixel-based Quadruped and Humanoid.

Reinforcement Learning (RL) Unsupervised Pre-training +1

Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation 27 Oct 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.

Generative Adversarial Network

EMI: Exploration with Mutual Information

1 code implementation 2 Oct 2018 Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Reinforcement learning algorithms struggle when the reward signal is very sparse.

Continuous Control Reinforcement Learning (RL)

Autonomous Evaluation and Refinement of Digital Agents

1 code implementation 9 Apr 2024 Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr

We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.

Autonomous Reinforcement Learning: Formalism and Benchmarking

2 code implementations ICLR 2022 Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.

Benchmarking reinforcement-learning +1

Universal Planning Networks

1 code implementation 2 Apr 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.

Imitation Learning Representation Learning +1

Learning Visual Servoing with Deep Features and Fitted Q-Iteration

2 code implementations 31 Mar 2017 Alex X. Lee, Sergey Levine, Pieter Abbeel

Our approach is based on servoing the camera in the space of learned visual features, rather than image pixels or manually-designed keypoints.

reinforcement-learning Reinforcement Learning (RL)

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

1 code implementation 16 Oct 2018 Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine

We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks.

Robot Navigation

Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control

1 code implementation ICML 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.

Imitation Learning

COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning

1 code implementation 27 Oct 2020 Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine

Reinforcement learning has been applied to a wide variety of robotics problems, but most of such applications involve collecting data from scratch for each new task.

reinforcement-learning Reinforcement Learning (RL)

Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models

1 code implementation 21 Apr 2021 Nicholas Rhinehart, Jeff He, Charles Packer, Matthew A. Wright, Rowan McAllister, Joseph E. Gonzalez, Sergey Levine

Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents.

Model-Based Reinforcement Learning via Latent-Space Collocation

1 code implementation 24 Jun 2021 Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.

Model-based Reinforcement Learning reinforcement-learning +1

Stochastic Variational Video Prediction

3 code implementations ICLR 2018 Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine

We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods.

Video Generation Video Prediction

Simple and Effective VAE Training with Calibrated Decoders

1 code implementation 23 Jun 2020 Oleh Rybkin, Kostas Daniilidis, Sergey Levine

We perform the first comprehensive comparative analysis of calibrated decoders and provide recommendations for simple and effective VAE training.

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

1 code implementation 11 Oct 2022 Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine

To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.

Offline RL Q-Learning +1

Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

1 code implementation NeurIPS 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Predictable MDP Abstraction for Unsupervised Model-Based RL

2 code implementations 8 Feb 2023 Seohong Park, Sergey Levine

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Automatically Composing Representation Transformations as a Means for Generalization

1 code implementation ICLR 2019 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.

Decision Making

Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads

2 code implementations 23 Apr 2020 Suneel Belkhale, Rachel Li, Gregory Kahn, Rowan McAllister, Roberto Calandra, Sergey Levine

Our experiments demonstrate that our online adaptation approach outperforms non-adaptive methods on a series of challenging suspended payload transportation tasks.

Meta-Learning Meta Reinforcement Learning +2

First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization

1 code implementation 24 May 2022 Siddharth Reddy, Sergey Levine, Anca D. Dragan

How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?

Mismatched No More: Joint Model-Policy Optimization for Model-Based RL

1 code implementation6 Oct 2021 Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov

Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning (a toy instance of this template is sketched below).

Model-based Reinforcement Learning Reinforcement Learning (RL)
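
A toy, self-contained instance of that template, under simplifying assumptions of our own (a scalar state and a linear model): fit the model to observed transitions, then use its predictions, rather than the real environment, to choose actions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # The unknown environment: s' = 0.9*s + a + noise.
    return 0.9 * s + a + 0.01 * rng.normal()

# 1. Fit a model to previously observed data.
SA = rng.normal(size=(256, 2))                   # observed (state, action) pairs
Y = np.array([true_step(s, a) for s, a in SA])   # observed next states
theta, *_ = np.linalg.lstsq(SA, Y, rcond=None)   # linear model: s' ~ [s, a] @ theta

# 2. Use data from the model for planning: grid-search the action whose
#    model-predicted next state lands closest to a goal.
goal, s = 1.0, 0.0
actions = np.linspace(-1, 1, 101)
best = actions[np.argmin([(np.array([s, a]) @ theta - goal) ** 2 for a in actions])]
print(f"model-selected action: {best:.2f}")
```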

The Information Geometry of Unsupervised Reinforcement Learning

1 code implementation ICLR 2022 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function (the underlying skill objective is sketched below).

Contrastive Learning reinforcement-learning +3
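
For context, such mutual-information skill objectives are typically implemented with a learned skill discriminator q(z|s), whose log-probability serves as an intrinsic reward; a minimal sketch with illustrative names:

```python
import torch
import torch.nn.functional as F

# Intrinsic reward for MI-based skill discovery:
#   r = log q(z|s) - log p(z), with a uniform prior over skills.
def intrinsic_reward(discriminator, state, skill_id, num_skills):
    log_q = F.log_softmax(discriminator(state), dim=-1)[..., skill_id]
    log_p = -torch.log(torch.tensor(float(num_skills)))
    return log_q - log_p
```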

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

1 code implementation25 Oct 2019 Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman

We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.

Imitation Learning reinforcement-learning +1

Deep Object-Centric Representations for Generalizable Robot Learning

1 code implementation14 Aug 2017 Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine

We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.

Object Reinforcement Learning (RL)

Continual Learning of Control Primitives: Skill Discovery via Reset-Games

1 code implementation10 Nov 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.

Continual Learning

Model-Based Visual Planning with Self-Supervised Functional Distances

1 code implementation ICLR 2021 Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning for Visual Navigation

1 code implementation16 Dec 2022 Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.

Navigate Offline RL +3

Accelerating Exploration with Unlabeled Prior Data

1 code implementation NeurIPS 2023 Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine

Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.

Reinforcement Learning (RL)

Diagnosing Bottlenecks in Deep Q-learning Algorithms

1 code implementation26 Feb 2019 Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). The basic update being diagnosed is sketched below.

Continuous Control Q-Learning +2
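
The update such diagnostics target is the standard deep Q-learning regression toward a bootstrapped target; a minimal sketch (networks and batch handling are illustrative):

```python
import torch

# One deep Q-learning loss: regress Q(s, a) toward r + gamma * max_a' Q_target(s', a').
def q_learning_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q for taken actions (a: int64)
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    return ((q - target) ** 2).mean()
```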

The Mirage of Action-Dependent Baselines in Reinforcement Learning

1 code implementation ICML 2018 George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine

Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance (see the sketch below).

Policy Gradient Methods reinforcement-learning +1
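
The construction under study: a REINFORCE-style estimator where subtracting a state-dependent baseline b(s) reduces variance without bias, since E_a[∇ log π(a|s)] = 0. A minimal sketch:

```python
import torch

# Policy gradient loss with a state-dependent baseline (trained separately).
def policy_gradient_loss(log_probs, returns, baselines):
    advantages = returns - baselines.detach()  # b(s) shifts, but does not bias, the gradient
    return -(log_probs * advantages).mean()
```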

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

1 code implementation9 Aug 2022 Marwa Abdulhai, Natasha Jaques, Sergey Levine

IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them.

reinforcement-learning Reinforcement Learning (RL)

Contextual Imagined Goals for Self-Supervised Robotic Learning

1 code implementation23 Oct 2019 Ashvin Nair, Shikhar Bahl, Alexander Khazatsky, Vitchyr Pong, Glen Berseth, Sergey Levine

When the robot's environment and available objects vary, as they do in most open-world settings, the robot must propose to itself only those goals that it can accomplish in its present setting with the objects that are at hand.

reinforcement-learning Reinforcement Learning (RL)

Reward-Conditioned Policies

1 code implementation31 Dec 2019 Aviral Kumar, Xue Bin Peng, Sergey Levine

By then conditioning the policy on the numerical value of the reward, we can obtain a policy that generalizes to larger returns (a minimal sketch follows below).

Imitation Learning reinforcement-learning +1
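
A minimal sketch of the idea as supervised learning, assuming continuous actions: the policy network takes the state concatenated with a target return and is regressed onto the actions of trajectories that achieved that return. Architecture and loss are illustrative simplifications.

```python
import torch
import torch.nn as nn

# Policy conditioned on (state, target return), trained by behavioral cloning.
class RewardConditionedPolicy(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, target_return):
        return self.net(torch.cat([state, target_return], dim=-1))

def supervised_update(policy, opt, states, actions, returns):
    pred = policy(states, returns.unsqueeze(-1))
    loss = ((pred - actions) ** 2).mean()  # clone actions that achieved these returns
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```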

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

1 code implementation ICLR 2018 Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine

In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt (the control flow is sketched below).

reinforcement-learning Reinforcement Learning (RL)
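
A minimal sketch of the control flow, with a hypothetical environment interface (`env.step`, `env.at_initial_state`) and callables standing in for the learned policies and the reset policy's value estimate:

```python
# Forward policy acts only while the reset policy's value estimate says a
# reset is still assured; otherwise the reset policy takes over early.
def run_with_resets(env, s, forward_policy, reset_policy, q_reset,
                    threshold, max_steps=200):
    for _ in range(max_steps):
        a = forward_policy(s)
        if q_reset(s, a) < threshold:   # abort before a hard-to-reverse step
            break
        s = env.step(a)
    while not env.at_initial_state(s):  # reset policy restores the environment
        s = env.step(reset_policy(s))
    return s
```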

Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning

3 code implementations6 Oct 2018 Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn

We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation.

Image Registration Self-Supervised Learning +1

Offline Reinforcement Learning with In-sample Q-Learning

1 code implementation ICLR 2022 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state (the expectile loss is sketched below).

D4RL Offline RL +3
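
The expectile step described above amounts to an asymmetric squared loss pushing V(s) toward an upper expectile of Q(s, a) over dataset actions; a minimal sketch with illustrative names:

```python
import torch

# Expectile regression loss: for tau > 0.5, positive errors (Q > V) are
# weighted more heavily, so V tracks the value of the best dataset actions.
def expectile_loss(q_values, v_values, tau=0.9):
    diff = q_values - v_values
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff ** 2).mean()
```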

What Can I Do Here? Learning New Skills by Imagining Visual Affordances

2 code implementations1 Jun 2021 Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine

In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.

Zero-shot Generalization

Deep Visual Foresight for Planning Robot Motion

1 code implementation3 Oct 2016 Chelsea Finn, Sergey Levine

A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback.

Model-based Reinforcement Learning Model Predictive Control +2

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

1 code implementation ICML 2020 Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman

Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment.

reinforcement-learning Reinforcement Learning (RL)

Offline Meta-Reinforcement Learning with Online Self-Supervision

1 code implementation8 Jul 2021 Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.

Meta Reinforcement Learning Offline RL +2

Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data

1 code implementation6 Jun 2023 Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine

Robotic systems that rely primarily on self-supervised learning have the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.

Contrastive Learning Data Augmentation +2

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

1 code implementation22 Sep 2017 Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke

We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN.

Domain Adaptation Industrial Robots +1

Understanding the World Through Action

1 code implementation24 Oct 2021 Sergey Levine

The recent history of machine learning research has taught us that machine learning methods can be most effective when they are provided with very large, high-capacity models, and trained on very large and diverse datasets.

reinforcement-learning Reinforcement Learning (RL)

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

1 code implementation NeurIPS 2019 Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine

In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors (the Gaussian composition rule is sketched below).

Continuous Control
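
For Gaussian primitives, such a weighted multiplicative composition has a closed form: the product of the powered Gaussians is again Gaussian, with precision-weighted mean. A minimal sketch (shapes and names are illustrative):

```python
import torch

# Compose primitives pi_i = N(mu_i, sigma_i^2) with weights w_i:
# the composite ~ prod_i pi_i^{w_i} is Gaussian with precision sum_i w_i / sigma_i^2.
def compose_gaussians(mus, sigmas, weights):
    # mus, sigmas: [num_primitives, action_dim]; weights: [num_primitives]
    precisions = weights.unsqueeze(-1) / sigmas ** 2
    sigma_sq = 1.0 / precisions.sum(dim=0)
    mu = sigma_sq * (precisions * mus).sum(dim=0)
    return mu, sigma_sq.sqrt()
```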

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

1 code implementation NeurIPS 2020 Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov

In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem with RL algorithms to efficiently solve many tasks (the relabeling operation is sketched below).

Reinforcement Learning (RL)
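
The relabeling operation in question fits in a few lines: swap a trajectory's commanded goal for a state it actually reached, turning a failed attempt into a demonstration for the reached goal (rewards would then be recomputed for the new goal). The data layout is illustrative.

```python
import random

# Hindsight relabeling: relabel all transitions with an achieved state as the goal.
def relabel(trajectory):
    # trajectory: list of (state, action, goal) tuples
    achieved = random.choice(trajectory)[0]
    return [(s, a, achieved) for (s, a, _) in trajectory]
```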

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

1 code implementation NeurIPS 2021 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.

Car Racing Decision Making +1

A Workflow for Offline Model-Free Robotic Reinforcement Learning

1 code implementation22 Sep 2021 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.

Offline RL reinforcement-learning +1

Learning Powerful Policies by Using Consistent Dynamics Model

1 code implementation11 Jun 2019 Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang

There is enough evidence that humans build a model of the environment, not only by observing the environment but also by interacting with the environment.

Atari Games Model-based Reinforcement Learning +1
