Search Results for author: Sergey Levine

Found 491 papers, 228 papers with code

Time-Contrastive Networks: Self-Supervised Learning from Video

7 code implementations 23 Apr 2017 Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine

While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.
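
The core of the approach is a time-contrastive triplet objective: embeddings of simultaneous frames from different viewpoints are pulled together, while temporally distant frames from the same view are pushed apart. A minimal sketch follows, assuming precomputed embeddings; the margin value and names are illustrative, not taken from the authors' release.

```python
# A sketch of the time-contrastive triplet loss: the anchor and positive are
# simultaneous frames from different viewpoints; the negative is a temporally
# distant frame from the anchor's own view. The margin is an illustrative value.
import torch.nn.functional as F

def tcn_triplet_loss(anchor_emb, positive_emb, negative_emb, margin=0.2):
    d_pos = (anchor_emb - positive_emb).pow(2).sum(dim=-1)
    d_neg = (anchor_emb - negative_emb).pow(2).sum(dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()
```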

Metric Learning reinforcement-learning +3

Data-Efficient Hierarchical Reinforcement Learning

12 code implementations NeurIPS 2018 Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine

In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.

Hierarchical Reinforcement Learning reinforcement-learning +1

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

2 code implementations 16 Nov 2015 Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

Unsupervised Learning for Physical Interaction through Video Prediction

3 code implementations NeurIPS 2016 Chelsea Finn, Ian Goodfellow, Sergey Levine

A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment.

Object Video Generation +1

High-Dimensional Continuous Control Using Generalized Advantage Estimation

17 code implementations 8 Jun 2015 John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.
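
The paper's estimator exponentially averages k-step advantage estimates. Below is a minimal sketch of the resulting recursion, assuming a single rollout with no episode boundaries; the function and argument names are illustrative, not from the authors' released code.

```python
# A minimal sketch of the GAE recursion under the assumptions above.
import numpy as np

def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """A_t = sum_l (gamma * lam)^l * delta_{t+l}, with
    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)."""
    values = np.append(values, last_value)        # bootstrap value for the final state
    advantages = np.zeros(len(rewards))
    gae = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae           # exponentially weighted sum of deltas
        advantages[t] = gae
    return advantages
```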

Continuous Control Policy Gradient Methods +1

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

1 code implementation NeurIPS 2019 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

We introduce a general control algorithm that combines the strengths of planning and reinforcement learning to effectively solve these tasks.

reinforcement-learning Reinforcement Learning (RL)

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

1 code implementation ICLR 2022 Mengjiao Yang, Sergey Levine, Ofir Nachum

In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.

Imitation Learning

Meta-Learning without Memorization

1 code implementation ICLR 2020 Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn

If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes.

Few-Shot Image Classification Memorization +1

Data-Driven Offline Optimization For Architecting Hardware Accelerators

1 code implementation ICLR 2022 Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine

An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any form of simulation.

Computer Architecture and Systems

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

82 code implementations ICML 2017 Chelsea Finn, Pieter Abbeel, Sergey Levine

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
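
The algorithm differentiates through one or more inner gradient steps so that the initial parameters become easy to fine-tune. A minimal single-task sketch in PyTorch, assuming a functional model `model_fn(params, x)` and a differentiable `loss_fn`; these names are illustrative assumptions, not the authors' API.

```python
# One MAML step for a single task, under the assumptions above.
import torch

def maml_outer_loss(params, support, query, model_fn, loss_fn, inner_lr=0.01):
    xs, ys = support
    # Inner loop: one gradient step on the support set.
    inner_loss = loss_fn(model_fn(params, xs), ys)
    grads = torch.autograd.grad(inner_loss, params, create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]
    # Outer objective: query loss under adapted parameters. Backpropagating
    # through `adapted` differentiates through the inner update itself.
    xq, yq = query
    return loss_fn(model_fn(adapted, xq), yq)
```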

Few-Shot Image Classification General Classification +4

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

3 code implementations 4 May 2020 Sergey Levine, Aviral Kumar, George Tucker, Justin Fu

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.

Decision Making reinforcement-learning +1

Conservative Q-Learning for Offline Reinforcement Learning

17 code implementations NeurIPS 2020 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
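
Concretely, the lower bound comes from a regularizer that pushes down Q-values on out-of-distribution actions while pushing them up on dataset actions. A sketch of the discrete-action CQL(H) variant follows; the network interfaces, batch layout, and `alpha` are illustrative assumptions.

```python
# A sketch of the discrete-action CQL(H) loss: standard Bellman error plus
# a conservative term (logsumexp of Q over actions minus Q at dataset actions).
import torch
import torch.nn.functional as F

def cql_loss(q_net, target_q_net, batch, gamma=0.99, alpha=1.0):
    s, a, r, s_next, done = batch                      # a: LongTensor of action indices
    q_all = q_net(s)                                   # (batch, num_actions)
    q_data = q_all.gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * target_q_net(s_next).max(dim=1).values
    bellman = F.mse_loss(q_data, target)
    conservative = (torch.logsumexp(q_all, dim=1) - q_data).mean()
    return bellman + alpha * conservative
```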

Continuous Control DQN Replay Dataset +3

Model-Based Reinforcement Learning for Atari

2 code implementations 1 Mar 2019 Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Atari Games 100k +4

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

1 code implementation ICLR 2020 Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.

Predict Future Video Frames Video Generation

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

5 code implementations 1 Oct 2019 Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine

In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines.
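
The policy update reduces to regression onto dataset actions weighted by exponentiated advantages, exp(A / beta). A minimal sketch, assuming a `policy.log_prob(states, actions)` interface; the weight cap is a stabilization detail and the names are illustrative.

```python
# A sketch of the AWR policy loss under the assumptions above.
import torch

def awr_policy_loss(policy, states, actions, advantages, beta=1.0, max_weight=20.0):
    weights = torch.exp(advantages / beta).clamp(max=max_weight)
    log_probs = policy.log_prob(states, actions)
    return -(weights * log_probs).mean()
```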

Continuous Control OpenAI Gym +3

Trust Region Policy Optimization

21 code implementations 19 Feb 2015 John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.
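
For reference, the surrogate problem solved at each iteration can be written as the following KL-constrained optimization (notation paraphrased from the paper):

```latex
\max_{\theta}\;
\mathbb{E}\!\left[
  \frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)}\,
  A_{\theta_{\mathrm{old}}}(s, a)
\right]
\quad \text{subject to} \quad
\mathbb{E}\!\left[
  D_{\mathrm{KL}}\!\left(
    \pi_{\theta_{\mathrm{old}}}(\cdot \mid s)\,\big\|\,\pi_\theta(\cdot \mid s)
  \right)
\right] \le \delta
```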

Atari Games Policy Gradient Methods

Reinforcement Learning with Deep Energy-Based Policies

3 code implementations ICML 2017 Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which was previously feasible only in tabular domains.

Q-Learning reinforcement-learning +1

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

4 code implementations 1 Mar 2016 Chelsea Finn, Sergey Levine, Pieter Abbeel

We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems.

Feature Engineering

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

5 code implementations ICLR 2020 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.
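
The algorithm itself is a one-line modification to off-policy RL: demonstrations enter the replay buffer with reward +1 and the agent's own transitions with reward 0. A sketch under an assumed replay-buffer interface:

```python
# SQIL's core recipe: run ordinary off-policy RL (e.g., soft Q-learning) but
# replace rewards with constants. The buffer interface below is an assumption.
def fill_replay_buffer(buffer, demo_transitions, agent_transitions):
    for (s, a, s_next) in demo_transitions:
        buffer.add(s, a, reward=1.0, next_state=s_next)   # expert data gets r = +1
    for (s, a, s_next) in agent_transitions:
        buffer.add(s, a, reward=0.0, next_state=s_next)   # agent data gets r = 0
    # Q-learning then proceeds unmodified on `buffer`.
```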

Imitation Learning Q-Learning +2

When to Trust Your Model: Model-Based Policy Optimization

11 code implementations NeurIPS 2019 Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.

Model-based Reinforcement Learning reinforcement-learning +1

Offline Reinforcement Learning with Implicit Q-Learning

15 code implementations 12 Oct 2021 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
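
The upper expectile is fit with an asymmetric squared loss. A minimal PyTorch sketch, assuming `q_values` come from a target critic at dataset actions and `v` from the state-value network; tau = 0.7 is a typical setting from the paper, and the names are illustrative.

```python
# The asymmetric squared loss used to fit an upper expectile of Q.
import torch

def expectile_loss(q_values, v, tau=0.7):
    diff = q_values - v
    weight = torch.abs(tau - (diff < 0).float())   # tau if diff > 0, else 1 - tau
    return (weight * diff.pow(2)).mean()
```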

D4RL Offline RL +3

Planning with Diffusion for Flexible Behavior Synthesis

2 code implementations 20 May 2022 Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.

Decision Making Denoising +2

Visual Reinforcement Learning with Imagined Goals

2 code implementations NeurIPS 2018 Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine

For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.

reinforcement-learning Reinforcement Learning (RL) +1

AWAC: Accelerating Online Reinforcement Learning with Offline Datasets

6 code implementations 16 Jun 2020 Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine

If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.

reinforcement-learning Reinforcement Learning (RL)

Robust Predictable Control

1 code implementation NeurIPS 2021 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression.

Computational Efficiency Decision Making +1

DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

6 code implementations 8 Apr 2018 Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne

We further explore a number of methods for integrating multiple clips into the learning process to develop multi-skilled agents capable of performing a rich repertoire of diverse skills.

Motion Synthesis reinforcement-learning +1

The Variational Bandwidth Bottleneck: Stochastic Evaluation on an Information Budget

1 code implementation ICLR 2020 Anirudh Goyal, Yoshua Bengio, Matthew Botvinick, Sergey Levine

This is typically the case when we have a standard conditioning input, such as a state observation, and a "privileged" input, which might correspond to the goal of a task, the output of a costly planning algorithm, or communication with another agent.

reinforcement-learning Reinforcement Learning (RL) +1

AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control

3 code implementations 5 Apr 2021 Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa

Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.

Imitation Learning Reinforcement Learning (RL)

Adaptive Risk Minimization: Learning to Adapt to Domain Shift

3 code implementations NeurIPS 2021 Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn

A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.

BIG-bench Machine Learning Domain Generalization +2

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

3 code implementations NeurIPS 2019 Aviral Kumar, Justin Fu, George Tucker, Sergey Levine

Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.

Continuous Control Q-Learning

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

7 code implementations 15 Apr 2020 Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine

In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.

D4RL Offline RL +2

Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

8 code implementations 24 Oct 2019 Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine

Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors.

Meta-Learning Meta Reinforcement Learning +3

The False Promise of Imitating Proprietary LLMs

1 code implementation 25 May 2023 Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song

This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.

Language Modelling

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

10 code implementations NeurIPS 2018 Kurtland Chua, Roberto Calandra, Rowan Mcallister, Sergey Levine

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance.

Model-based Reinforcement Learning reinforcement-learning +1

Deep Dynamics Models for Learning Dexterous Manipulation

2 code implementations 25 Sep 2019 Anusha Nagabandi, Kurt Konolige, Sergey Levine, Vikash Kumar

Dexterous multi-fingered hands can provide robots with the ability to flexibly perform a wide range of manipulation skills.

Model Predictive Control

Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review

2 code implementations 2 May 2018 Sergey Levine

The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable.

Decision Making reinforcement-learning +2

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

5 code implementations ICLR 2019 Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine

By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.

Continuous Control Image Generation +1

Guided Policy Search as Approximate Mirror Descent

1 code implementation 15 Jul 2016 William Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.

Value Iteration Networks

8 code implementations NeurIPS 2016 Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel

We introduce the value iteration network (VIN): a fully differentiable neural network with a `planning module' embedded within.

reinforcement-learning Reinforcement Learning (RL)

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

1 code implementation 16 Nov 2018 Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin.
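
The arithmetic relationship is that the scene embedding before a grasp, minus the embedding after, should match the embedding of the object that was removed. A retrieval-scoring sketch, assuming encoder callables `phi_scene` and `phi_object` (illustrative names, not the paper's code):

```python
# The embedding arithmetic behind Grasp2Vec: (scene before grasp) minus
# (scene after grasp) should align with the grasped object's embedding.
import torch.nn.functional as F

def grasp_consistency_score(phi_scene, phi_object, pre_img, post_img, object_img):
    removed = phi_scene(pre_img) - phi_scene(post_img)   # what left the scene
    obj = phi_object(object_img)
    return F.cosine_similarity(removed, obj, dim=-1)     # high if this object was grasped
```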

Object Representation Learning

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

1 code implementation ICLR 2021 Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning as One Big Sequence Modeling Problem

2 code implementations NeurIPS 2021 Michael Janner, Qiyang Li, Sergey Levine

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.

Imitation Learning Offline RL +2

Reinforcement Learning as One Big Sequence Modeling Problem

1 code implementation ICML Workshop URL 2021 Michael Janner, Qiyang Li, Sergey Levine

However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.

Imitation Learning Offline RL +2

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation 19 Mar 2018 Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.

Q-Learning reinforcement-learning +1

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

7 code implementations 30 Oct 2017 Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Decision Making reinforcement-learning +1

End-to-End Robotic Reinforcement Learning without Reward Engineering

3 code implementations 16 Apr 2019 Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine

In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.

reinforcement-learning Reinforcement Learning (RL)

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

8 code implementations 8 Aug 2017 Anusha Nagabandi, Gregory Kahn, Ronald S. Fearing, Sergey Levine

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.

Model-based Reinforcement Learning Model Predictive Control +2

Training Diffusion Models with Reinforcement Learning

2 code implementations 22 May 2023 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.

Decision Making Denoising +2

OmniTact: A Multi-Directional High Resolution Touch Sensor

1 code implementation 16 Mar 2020 Akhil Padmanabha, Frederik Ebert, Stephen Tian, Roberto Calandra, Chelsea Finn, Sergey Levine

We compare with a state-of-the-art tactile sensor that is only sensitive on one side, as well as a state-of-the-art multi-directional tactile sensor, and find that OmniTact's combination of high-resolution and multi-directional sensing is crucial for reliably inserting the electrical connector and allows for higher accuracy in the state estimation task.

SFV: Reinforcement Learning of Physical Skills from Videos

1 code implementation 8 Oct 2018 Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine

In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV).

Pose Estimation reinforcement-learning +1

Stochastic Adversarial Video Prediction

4 code implementations ICLR 2019 Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.

Ranked #1 on Video Prediction on KTH (Cond metric)

Representation Learning Video Generation +1

Gradient Surgery for Multi-Task Learning

9 code implementations NeurIPS 2020 Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn

While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.

Image Classification Multi-Task Learning +1

One-Shot Visual Imitation Learning via Meta-Learning

3 code implementations 14 Sep 2017 Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine

In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration.

Imitation Learning Meta-Learning

Adversarial Policies: Attacking Deep Reinforcement Learning

2 code implementations ICLR 2020 Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell

Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.

reinforcement-learning Reinforcement Learning (RL)

Inter-Level Cooperation in Hierarchical Reinforcement Learning

1 code implementation 5 Dec 2019 Abdul Rahman Kreidieh, Glen Berseth, Brandon Trabucco, Samyak Parajuli, Sergey Levine, Alexandre M. Bayen

This allows us to draw on connections between communication and cooperation in multi-agent RL, and demonstrate the benefits of increased cooperation between sub-policies on the training performance of the overall policy.

Hierarchical Reinforcement Learning reinforcement-learning +1

A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models

3 code implementations 11 Nov 2016 Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine

In particular, we demonstrate an equivalence between a sample-based algorithm for maximum entropy IRL and a GAN in which the generator's density can be evaluated and is provided as an additional input to the discriminator.
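
In that construction, the discriminator takes the special form below, where c_theta is the learned cost, Z its (estimated) partition function, and q the generator's density (notation paraphrased from the paper):

```latex
D_\theta(\tau) \;=\;
\frac{\tfrac{1}{Z}\exp\!\big(-c_\theta(\tau)\big)}
     {\tfrac{1}{Z}\exp\!\big(-c_\theta(\tau)\big) + q(\tau)}
```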

Imitation Learning reinforcement-learning +1

Probabilistic Model-Agnostic Meta-Learning

1 code implementation NeurIPS 2018 Chelsea Finn, Kelvin Xu, Sergey Levine

However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e.g., a classifier) for that task that is accurate.

Active Learning Few-Shot Image Classification +1

A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning

1 code implementation 16 Aug 2022 Laura Smith, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.

reinforcement-learning Reinforcement Learning (RL)

GNM: A General Navigation Model to Drive Any Robot

1 code implementation 7 Oct 2022 Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data.

COMBO: Conservative Offline Model-Based Policy Optimization

4 code implementations NeurIPS 2021 Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.

Offline RL Uncertainty Quantification

Learning from the Hindsight Plan -- Episodic MPC Improvement

1 code implementation 28 Sep 2016 Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function with the goal of satisfying the following property: short horizon planning (as realistic during real executions) with respect to the shaped cost should result in mimicking the hindsight plan.

Model Predictive Control

GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots

1 code implementation 12 Sep 2022 Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine

In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots.

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning

2 code implementations ICLR 2019 Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn

Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.

Continuous Control Meta-Learning +5

Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?

2 code implementations ICML 2020 Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, Yarin Gal

Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment, typically leading to arbitrary deductions and poorly-informed decisions.

Autonomous Vehicles Out of Distribution (OOD) Detection

Dynamics-Aware Unsupervised Discovery of Skills

3 code implementations 2 Jul 2019 Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.

Model-based Reinforcement Learning

Emergent Real-World Robotic Skills via Unsupervised Off-Policy Reinforcement Learning

2 code implementations 27 Apr 2020 Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu

Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?

Model Predictive Control reinforcement-learning +2

Dynamics-Aware Unsupervised Skill Discovery

1 code implementation ICLR 2020 Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.

Model-based Reinforcement Learning

Self-Supervised Visual Planning with Temporal Skip Connections

3 code implementations 15 Oct 2017 Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine

One learning signal that is always available for autonomously collected data is prediction: if a robot can learn to predict the future, it can use this predictive model to take actions to produce desired outcomes, such as moving an object to a particular location.

Video Prediction

LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action

1 code implementation 10 Jul 2022 Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine

Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings.

Instruction Following Language Modelling

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

2 code implementations 7 Nov 2016 Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine

We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.

Continuous Control Policy Gradient Methods +2

Few-Shot Segmentation Propagation with Guided Networks

1 code implementation 25 May 2018 Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.

Interactive Segmentation Segmentation +3

BADGR: An Autonomous Self-Supervised Learning-Based Navigation System

1 code implementation 13 Feb 2020 Gregory Kahn, Pieter Abbeel, Sergey Levine

Mobile robot navigation is typically regarded as a geometric problem, in which the robot's objective is to perceive the geometry of the environment in order to plan collision-free paths towards a desired goal.

Navigate Robot Navigation +1

Learning Invariant Representations for Reinforcement Learning without Reconstruction

2 code implementations 18 Jun 2020 Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey Levine

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.

Causal Inference reinforcement-learning +2

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

1 code implementation 3 Dec 2018 Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains.

reinforcement-learning Reinforcement Learning (RL)

Meta-Learning with Implicit Gradients

6 code implementations NeurIPS 2019 Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine

By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.

Few-Shot Image Classification Few-Shot Learning

Recurrent Independent Mechanisms

3 code implementations ICLR 2021 Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf

Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes.

Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning

1 code implementation8 Dec 2020 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan

In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation

2 code implementations 29 Sep 2017 Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine

To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based.

Navigate Q-Learning +3

Multimodal Masked Autoencoders Learn Transferable Representations

1 code implementation 27 May 2022 Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel

We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.

Contrastive Learning

PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings

2 code implementations ICCV 2019 Nicholas Rhinehart, Rowan McAllister, Kris Kitani, Sergey Levine

For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.

Autonomous Vehicles

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

3 code implementations NeurIPS 2020 Aviral Kumar, Abhishek Gupta, Sergey Levine

We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.

Meta-Learning Multi-Task Learning +3

Continuous Deep Q-Learning with Model-based Acceleration

8 code implementations 2 Mar 2016 Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.

Continuous Control Q-Learning +2

Learning Human Objectives by Evaluating Hypothetical Behavior

1 code implementation ICML 2020 Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike

To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.

Car Racing

Benchmarks for Deep Off-Policy Evaluation

3 code implementations ICLR 2021 Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.

Benchmarking Continuous Control +3

Reinforcement Learning from Passive Data via Latent Intentions

1 code implementation 10 Apr 2023 Dibya Ghosh, Chethan Bhateja, Sergey Levine

Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.

reinforcement-learning Value prediction

Learning Instance Segmentation by Interaction

1 code implementation 21 Jun 2018 Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik

The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.

Instance Segmentation Segmentation +1

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization

3 code implementations 17 Feb 2022 Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine

To address this, we present Design-Bench, a benchmark for offline MBO with a unified evaluation protocol and reference implementations of recent methods.

Adversarial Policies Beat Superhuman Go AIs

2 code implementations 1 Nov 2022 Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack.

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation 20 Apr 2023 Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.

Offline RL Q-Learning

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations ICLR 2021 Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning Reinforcement Learning (RL)

Shared Autonomy via Deep Reinforcement Learning

1 code implementation 6 Feb 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.

reinforcement-learning Reinforcement Learning (RL)

SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning

1 code implementation ICLR 2019 Marvin Zhang, Sharad Vikram, Laura Smith, Pieter Abbeel, Matthew J. Johnson, Sergey Levine

Model-based reinforcement learning (RL) has proven to be a data efficient approach for learning control tasks but is difficult to utilize in domains with complex observations such as images.

Model-based Reinforcement Learning reinforcement-learning +1

RvS: What is Essential for Offline RL via Supervised Learning?

1 code implementation 20 Dec 2021 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.

Offline RL

Efficient Exploration via State Marginal Matching

1 code implementation 12 Jun 2019 Lisa Lee, Benjamin Eysenbach, Emilio Parisotto, Eric Xing, Sergey Levine, Ruslan Salakhutdinov

The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.
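
Minimizing the KL divergence between the policy's state marginal rho_pi and a target distribution p* yields an intrinsic reward of the form below, which the two players alternately optimize via the policy and the density model (a paraphrase of the objective, not the paper's exact notation):

```latex
r_\pi(s) \;=\; \log p^{*}(s) \;-\; \log \rho_\pi(s)
```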

Efficient Exploration Unsupervised Reinforcement Learning

FitVid: Overfitting in Pixel-Level Video Prediction

1 code implementation 24 Jun 2021 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan

There is a growing body of evidence that underfitting on the training data is one of the primary causes for the low quality predictions.

Image Augmentation Video Generation +1

MELD: Meta-Reinforcement Learning from Images via Latent State Models

1 code implementation 26 Oct 2020 Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine

Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.

Meta-Learning Meta Reinforcement Learning +3

Divide-and-Conquer Reinforcement Learning

1 code implementation ICLR 2018 Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine

In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.

Policy Gradient Methods reinforcement-learning +1

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

2 code implementations NeurIPS 2023 Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine

Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.
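
Relative to CQL, the calibration amounts to clipping the Q-values being pushed down so they never fall below a reference value, such as a Monte Carlo return estimate of a behavior policy. A sketch of that term; the tensor names are illustrative assumptions, not the authors' code.

```python
# A sketch of Cal-QL's change to the CQL regularizer: Q-values at policy
# actions are clipped from below at a reference value before being pushed down.
import torch

def calql_conservative_term(q_policy_actions, q_data_actions, reference_values):
    q_clipped = torch.maximum(q_policy_actions, reference_values)  # the calibration
    return (q_clipped - q_data_actions).mean()
```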

Offline RL Q-Learning +1

HIQL: Offline Goal-Conditioned RL with Latent States as Actions

1 code implementation NeurIPS 2023 Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine

This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.

Reinforcement Learning (RL) Unsupervised Pre-training

Extending the WILDS Benchmark for Unsupervised Adaptation

1 code implementation ICLR 2022 Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

2 code implementations NAACL 2022 Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.

Chatbot Offline RL +2

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

1 code implementation 29 Feb 2024 Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar

In this paper, we develop a framework for building multi-turn RL algorithms for fine-tuning LLMs, that preserves the flexibility of existing single-turn RL methods for LLMs (e.g., proximal policy optimization), while accommodating multiple turns, long horizons, and delayed rewards effectively.

Language Modelling Reinforcement Learning (RL)

Entity Abstraction in Visual Model-Based Reinforcement Learning

1 code implementation 28 Oct 2019 Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine

This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before.

Model-based Reinforcement Learning Object +5

Learning to Walk in the Real World with Minimal Human Effort

1 code implementation 20 Feb 2020 Sehoon Ha, Peng Xu, Zhenyu Tan, Sergey Levine, Jie Tan

In this paper, we develop a system for learning legged locomotion policies with deep RL in the real world with minimal human effort.

Multi-Task Learning

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

1 code implementation 30 Nov 2023 Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine

Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.

reinforcement-learning Text Generation

Foundation Policies with Hilbert Representations

1 code implementation 23 Feb 2024 Seohong Park, Tobias Kreiman, Sergey Levine

While a number of methods have been proposed to enable generic self-supervised RL, based on principles such as goal-conditioned RL, behavioral cloning, and unsupervised skill learning, such methods remain limited in terms of either the diversity of the discovered behaviors, the need for high-quality demonstration data, or the lack of a clear prompting or adaptation mechanism for downstream tasks.

Reinforcement Learning (RL) Unsupervised Pre-training

Learning with Latent Language

1 code implementation NAACL 2018 Jacob Andreas, Dan Klein, Sergey Levine

The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world.

Image Classification Navigate

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

1 code implementation 11 Jul 2017 YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.

Imitation Learning Translation +1

Conservative Objective Models for Effective Offline Model-Based Optimization

2 code implementations 14 Jul 2021 Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

Computational design problems arise in a number of settings, from synthetic biology to computer architectures.

Planning with Goal-Conditioned Policies

1 code implementation NeurIPS 2019 Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine

Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors.

Decision Making reinforcement-learning +3

MEMO: Test Time Robustness via Adaptation and Augmentation

2 code implementations 18 Oct 2021 Marvin Zhang, Sergey Levine, Chelsea Finn

We study the problem of test time robustification, i.e., using the test input to improve model robustness.

Test-time Adaptation

Evolving Reinforcement Learning Algorithms

5 code implementations ICLR 2021 John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.

Atari Games Meta-Learning +2

Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

1 code implementation 10 Jul 2017 Rouhollah Rahmatizadeh, Pooya Abolghasemi, Ladislau Bölöni, Sergey Levine

We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation.

Multi-Task Learning Position

Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors

1 code implementation NeurIPS 2020 Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine

In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations.

Offline Meta-Reinforcement Learning with Advantage Weighting

2 code implementations 13 Aug 2020 Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks in order to adapt to a new task with a very small amount (less than 5 trajectories) of data from the new task.

Machine Translation Meta-Learning +5

Causal Confusion in Imitation Learning

2 code implementations NeurIPS 2019 Pim de Haan, Dinesh Jayaraman, Sergey Levine

Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment.

Imitation Learning

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

1 code implementation 27 Feb 2024 Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine

Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner?

Offline RL reinforcement-learning

LaND: Learning to Navigate from Disengagements

1 code implementation 9 Oct 2020 Gregory Kahn, Pieter Abbeel, Sergey Levine

However, we believe that these disengagements not only show where the system fails, which is useful for troubleshooting, but also provide a direct learning signal by which the robot can learn to navigate.

Autonomous Navigation Imitation Learning +3

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

1 code implementation 13 Oct 2023 Seohong Park, Oleh Rybkin, Sergey Levine

Through our experiments in five locomotion and manipulation environments, we demonstrate that METRA can discover a variety of useful behaviors even in complex, pixel-based environments, being the first unsupervised RL method that discovers diverse locomotion behaviors in pixel-based Quadruped and Humanoid.

Reinforcement Learning (RL) Unsupervised Pre-training +1

Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation 27 Oct 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.

Generative Adversarial Network

EMI: Exploration with Mutual Information

1 code implementation 2 Oct 2018 Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Reinforcement learning algorithms struggle when the reward signal is very sparse.

Continuous Control Reinforcement Learning (RL)

Autonomous Evaluation and Refinement of Digital Agents

1 code implementation 9 Apr 2024 Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr

We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.

Autonomous Reinforcement Learning: Formalism and Benchmarking

2 code implementations ICLR 2022 Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.

Benchmarking reinforcement-learning +1

Universal Planning Networks

1 code implementation 2 Apr 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.

Imitation Learning Representation Learning +1

Learning Visual Servoing with Deep Features and Fitted Q-Iteration

2 code implementations 31 Mar 2017 Alex X. Lee, Sergey Levine, Pieter Abbeel

Our approach is based on servoing the camera in the space of learned visual features, rather than image pixels or manually-designed keypoints.

reinforcement-learning Reinforcement Learning (RL)

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

1 code implementation 16 Oct 2018 Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine

We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks.

Robot Navigation

Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control

1 code implementation ICML 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.

Imitation Learning

COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning

1 code implementation 27 Oct 2020 Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine

Reinforcement learning has been applied to a wide variety of robotics problems, but most of such applications involve collecting data from scratch for each new task.

reinforcement-learning Reinforcement Learning (RL)

Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models

1 code implementation 21 Apr 2021 Nicholas Rhinehart, Jeff He, Charles Packer, Matthew A. Wright, Rowan McAllister, Joseph E. Gonzalez, Sergey Levine

Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents.

Model-Based Reinforcement Learning via Latent-Space Collocation

1 code implementation 24 Jun 2021 Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.

Model-based Reinforcement Learning reinforcement-learning +1

Stochastic Variational Video Prediction

3 code implementations ICLR 2018 Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine

We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods.

Video Generation Video Prediction

Simple and Effective VAE Training with Calibrated Decoders

1 code implementation 23 Jun 2020 Oleh Rybkin, Kostas Daniilidis, Sergey Levine

We perform the first comprehensive comparative analysis of calibrated decoders and provide recommendations for simple and effective VAE training.

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

1 code implementation 11 Oct 2022 Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine

To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.

Offline RL Q-Learning +1

Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

1 code implementation NeurIPS 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Predictable MDP Abstraction for Unsupervised Model-Based RL

2 code implementations 8 Feb 2023 Seohong Park, Sergey Levine

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Automatically Composing Representation Transformations as a Means for Generalization

1 code implementation ICLR 2019 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.

Decision Making

Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads

2 code implementations 23 Apr 2020 Suneel Belkhale, Rachel Li, Gregory Kahn, Rowan McAllister, Roberto Calandra, Sergey Levine

Our experiments demonstrate that our online adaptation approach outperforms non-adaptive methods on a series of challenging suspended payload transportation tasks.

Meta-Learning Meta Reinforcement Learning +2

First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization

1 code implementation 24 May 2022 Siddharth Reddy, Sergey Levine, Anca D. Dragan

How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?

Mismatched No More: Joint Model-Policy Optimization for Model-Based RL

1 code implementation6 Oct 2021 Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov

Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning (a toy instance of this template is sketched below).

Model-based Reinforcement Learning Reinforcement Learning (RL)
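
A toy, self-contained instance of that template, under simplifying assumptions of our own (a scalar state and a linear model): fit the model to observed transitions, then use its predictions, rather than the real environment, to choose actions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # The unknown environment: s' = 0.9*s + a + noise.
    return 0.9 * s + a + 0.01 * rng.normal()

# 1. Fit a model to previously observed data.
SA = rng.normal(size=(256, 2))                   # observed (state, action) pairs
Y = np.array([true_step(s, a) for s, a in SA])   # observed next states
theta, *_ = np.linalg.lstsq(SA, Y, rcond=None)   # linear model: s' ~ [s, a] @ theta

# 2. Use data from the model for planning: grid-search the action whose
#    model-predicted next state lands closest to a goal.
goal, s = 1.0, 0.0
actions = np.linspace(-1, 1, 101)
best = actions[np.argmin([(np.array([s, a]) @ theta - goal) ** 2 for a in actions])]
print(f"model-selected action: {best:.2f}")
```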

The Information Geometry of Unsupervised Reinforcement Learning

1 code implementation ICLR 2022 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function (the underlying skill objective is sketched below).

Contrastive Learning reinforcement-learning +3
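
For context, such mutual-information skill objectives are typically implemented with a learned skill discriminator q(z|s), whose log-probability serves as an intrinsic reward; a minimal sketch with illustrative names:

```python
import torch
import torch.nn.functional as F

# Intrinsic reward for MI-based skill discovery:
#   r = log q(z|s) - log p(z), with a uniform prior over skills.
def intrinsic_reward(discriminator, state, skill_id, num_skills):
    log_q = F.log_softmax(discriminator(state), dim=-1)[..., skill_id]
    log_p = -torch.log(torch.tensor(float(num_skills)))
    return log_q - log_p
```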

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

1 code implementation25 Oct 2019 Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman

We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.

Imitation Learning reinforcement-learning +1

Deep Object-Centric Representations for Generalizable Robot Learning

1 code implementation14 Aug 2017 Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine

We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.

Object Reinforcement Learning (RL)

Continual Learning of Control Primitives: Skill Discovery via Reset-Games

1 code implementation10 Nov 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.

Continual Learning

Model-Based Visual Planning with Self-Supervised Functional Distances

1 code implementation ICLR 2021 Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning for Visual Navigation

1 code implementation16 Dec 2022 Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.

Navigate Offline RL +3

Accelerating Exploration with Unlabeled Prior Data

1 code implementation NeurIPS 2023 Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine

Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.

Reinforcement Learning (RL)

Diagnosing Bottlenecks in Deep Q-learning Algorithms

1 code implementation26 Feb 2019 Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). The basic update being diagnosed is sketched below.

Continuous Control Q-Learning +2
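
The update such diagnostics target is the standard deep Q-learning regression toward a bootstrapped target; a minimal sketch (networks and batch handling are illustrative):

```python
import torch

# One deep Q-learning loss: regress Q(s, a) toward r + gamma * max_a' Q_target(s', a').
def q_learning_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q for taken actions (a: int64)
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    return ((q - target) ** 2).mean()
```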

The Mirage of Action-Dependent Baselines in Reinforcement Learning

1 code implementation ICML 2018 George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine

Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance (see the sketch below).

Policy Gradient Methods reinforcement-learning +1
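
The construction under study: a REINFORCE-style estimator where subtracting a state-dependent baseline b(s) reduces variance without bias, since E_a[∇ log π(a|s)] = 0. A minimal sketch:

```python
import torch

# Policy gradient loss with a state-dependent baseline (trained separately).
def policy_gradient_loss(log_probs, returns, baselines):
    advantages = returns - baselines.detach()  # b(s) shifts, but does not bias, the gradient
    return -(log_probs * advantages).mean()
```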

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

1 code implementation9 Aug 2022 Marwa Abdulhai, Natasha Jaques, Sergey Levine

IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them.

reinforcement-learning Reinforcement Learning (RL)

Contextual Imagined Goals for Self-Supervised Robotic Learning

1 code implementation23 Oct 2019 Ashvin Nair, Shikhar Bahl, Alexander Khazatsky, Vitchyr Pong, Glen Berseth, Sergey Levine

When the robot's environment and available objects vary, as they do in most open-world settings, the robot must propose to itself only those goals that it can accomplish in its present setting with the objects that are at hand.

reinforcement-learning Reinforcement Learning (RL)

Reward-Conditioned Policies

1 code implementation31 Dec 2019 Aviral Kumar, Xue Bin Peng, Sergey Levine

By then conditioning the policy on the numerical value of the reward, we can obtain a policy that generalizes to larger returns (a minimal sketch follows below).

Imitation Learning reinforcement-learning +1
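
A minimal sketch of the idea as supervised learning, assuming continuous actions: the policy network takes the state concatenated with a target return and is regressed onto the actions of trajectories that achieved that return. Architecture and loss are illustrative simplifications.

```python
import torch
import torch.nn as nn

# Policy conditioned on (state, target return), trained by behavioral cloning.
class RewardConditionedPolicy(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, target_return):
        return self.net(torch.cat([state, target_return], dim=-1))

def supervised_update(policy, opt, states, actions, returns):
    pred = policy(states, returns.unsqueeze(-1))
    loss = ((pred - actions) ** 2).mean()  # clone actions that achieved these returns
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```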

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

1 code implementation ICLR 2018 Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine

In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt (the control flow is sketched below).

reinforcement-learning Reinforcement Learning (RL)
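
A minimal sketch of the control flow, with a hypothetical environment interface (`env.step`, `env.at_initial_state`) and callables standing in for the learned policies and the reset policy's value estimate:

```python
# Forward policy acts only while the reset policy's value estimate says a
# reset is still assured; otherwise the reset policy takes over early.
def run_with_resets(env, s, forward_policy, reset_policy, q_reset,
                    threshold, max_steps=200):
    for _ in range(max_steps):
        a = forward_policy(s)
        if q_reset(s, a) < threshold:   # abort before a hard-to-reverse step
            break
        s = env.step(a)
    while not env.at_initial_state(s):  # reset policy restores the environment
        s = env.step(reset_policy(s))
    return s
```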

Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning

3 code implementations6 Oct 2018 Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn

We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation.

Image Registration Self-Supervised Learning +1

Offline Reinforcement Learning with In-sample Q-Learning

1 code implementation ICLR 2022 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state (the expectile loss is sketched below).

D4RL Offline RL +3
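
The expectile step described above amounts to an asymmetric squared loss pushing V(s) toward an upper expectile of Q(s, a) over dataset actions; a minimal sketch with illustrative names:

```python
import torch

# Expectile regression loss: for tau > 0.5, positive errors (Q > V) are
# weighted more heavily, so V tracks the value of the best dataset actions.
def expectile_loss(q_values, v_values, tau=0.9):
    diff = q_values - v_values
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff ** 2).mean()
```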

What Can I Do Here? Learning New Skills by Imagining Visual Affordances

2 code implementations1 Jun 2021 Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine

In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.

Zero-shot Generalization

Deep Visual Foresight for Planning Robot Motion

1 code implementation3 Oct 2016 Chelsea Finn, Sergey Levine

A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback.

Model-based Reinforcement Learning Model Predictive Control +2

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

1 code implementation ICML 2020 Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman

Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment.

reinforcement-learning Reinforcement Learning (RL)

Offline Meta-Reinforcement Learning with Online Self-Supervision

1 code implementation8 Jul 2021 Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.

Meta Reinforcement Learning Offline RL +2

Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data

1 code implementation6 Jun 2023 Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine

Robotic systems that rely primarily on self-supervised learning have the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.

Contrastive Learning Data Augmentation +2

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

1 code implementation22 Sep 2017 Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke

We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN.

Domain Adaptation Industrial Robots +1

Understanding the World Through Action

1 code implementation24 Oct 2021 Sergey Levine

The recent history of machine learning research has taught us that machine learning methods can be most effective when they are provided with very large, high-capacity models, and trained on very large and diverse datasets.

reinforcement-learning Reinforcement Learning (RL)

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

1 code implementation NeurIPS 2019 Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine

In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors (the Gaussian composition rule is sketched below).

Continuous Control
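
For Gaussian primitives, such a weighted multiplicative composition has a closed form: the product of the powered Gaussians is again Gaussian, with precision-weighted mean. A minimal sketch (shapes and names are illustrative):

```python
import torch

# Compose primitives pi_i = N(mu_i, sigma_i^2) with weights w_i:
# the composite ~ prod_i pi_i^{w_i} is Gaussian with precision sum_i w_i / sigma_i^2.
def compose_gaussians(mus, sigmas, weights):
    # mus, sigmas: [num_primitives, action_dim]; weights: [num_primitives]
    precisions = weights.unsqueeze(-1) / sigmas ** 2
    sigma_sq = 1.0 / precisions.sum(dim=0)
    mu = sigma_sq * (precisions * mus).sum(dim=0)
    return mu, sigma_sq.sqrt()
```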

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

1 code implementation NeurIPS 2020 Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov

In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem with RL algorithms to efficiently solve many tasks (the relabeling operation is sketched below).

Reinforcement Learning (RL)
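
The relabeling operation in question fits in a few lines: swap a trajectory's commanded goal for a state it actually reached, turning a failed attempt into a demonstration for the reached goal (rewards would then be recomputed for the new goal). The data layout is illustrative.

```python
import random

# Hindsight relabeling: relabel all transitions with an achieved state as the goal.
def relabel(trajectory):
    # trajectory: list of (state, action, goal) tuples
    achieved = random.choice(trajectory)[0]
    return [(s, a, achieved) for (s, a, _) in trajectory]
```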

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

1 code implementation NeurIPS 2021 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.

Car Racing Decision Making +1

A Workflow for Offline Model-Free Robotic Reinforcement Learning

1 code implementation22 Sep 2021 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.

Offline RL reinforcement-learning +1

Learning Powerful Policies by Using Consistent Dynamics Model

1 code implementation11 Jun 2019 Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang

There is enough evidence that humans build a model of the environment, not only by observing the environment but also by interacting with the environment.

Atari Games Model-based Reinforcement Learning +1
