Search Results for author: Sergey Levine

Found 436 papers, 204 papers with code

Global Decision-Making via Local Economic Transactions

no code implementations ICML 2020 Michael Chang, Sid Kaushik, S. Matthew Weinberg, Sergey Levine, Thomas Griffiths

This paper seeks to establish a mechanism for directing a collection of simple, specialized, self-interested agents to solve what are traditionally posed as monolithic single-agent sequential decision problems with a central global objective.

Decision Making

Neural Constraint Satisfaction: Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement

no code implementations20 Mar 2023 Michael Chang, Alyssa L. Dayan, Franziska Meier, Thomas L. Griffiths, Sergey Levine, Amy Zhang

Object rearrangement is a challenge for embodied agents because solving these tasks requires generalizing across a combinatorially large set of configurations of entities and their locations.

Ignorance is Bliss: Robust Control via Information Gating

no code implementations10 Mar 2023 Manan Tomar, Riashat Islam, Sergey Levine, Philip Bachman

Informational parsimony -- i.e., using the minimal information required for a task -- provides a useful inductive bias for learning representations that achieve better generalization by being robust to noise and spurious correlations.

Inductive Bias Q-Learning

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

no code implementations9 Mar 2023 Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine

Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.

Offline RL Q-Learning +1
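
To make the calibration idea concrete, here is a minimal, hypothetical sketch of a Cal-QL-style conservative regularizer in PyTorch; the `q_net` interface and the use of behavior-policy Monte Carlo returns as `ref_values` are assumptions, not the paper's exact implementation.

```python
import torch

def cal_ql_regularizer(q_net, states, actions_pi, actions_data, ref_values, alpha=1.0):
    """Sketch: conservative push-down on policy actions, clipped from below
    by a reference value estimate so the learned Q-function stays
    conservative but at a reasonable scale."""
    q_pi = q_net(states, actions_pi)              # Q at actions from the learned policy
    q_data = q_net(states, actions_data)          # Q at actions from the offline dataset
    calibrated = torch.maximum(q_pi, ref_values)  # do not push Q below the reference
    return alpha * (calibrated.mean() - q_data.mean())  # added to the usual TD loss
```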

Learning to Influence Human Behavior with Offline Reinforcement Learning

no code implementations3 Mar 2023 Joey Hong, Anca Dragan, Sergey Levine

Our goal in this work is to enable agents to leverage that influence to improve the human's performance in collaborative tasks, as the task unfolds.

Offline RL reinforcement-learning +1

Grounded Decoding: Guiding Text Generation with Grounded Models for Robot Control

no code implementations1 Mar 2023 Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter

Recent progress in large language models (LLMs) has demonstrated the ability to learn and leverage Internet-scale knowledge through pre-training with autoregressive models.

Language Modelling Text Generation

Robust and Versatile Bipedal Jumping Control through Multi-Task Reinforcement Learning

no code implementations19 Feb 2023 Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

Such robustness in the proposed multi-task policy enables Cassie to succeed in completing a variety of challenging jump tasks in the real world, such as standing long jumps, jumping onto elevated platforms, and multi-axis jumps.

reinforcement-learning Reinforcement Learning (RL)

Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features

no code implementations10 Feb 2023 Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn

We propose a lightweight, sample-efficient approach that learns a diverse set of features and adapts to a target distribution by interpolating these features with a small target dataset.

Domain Adaptation
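
As a rough illustration of the project-then-probe recipe, the sketch below substitutes PCA directions for the paper's learned label-predictive orthogonal projections; the function name and the scikit-learn probe are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def project_and_probe(feats_src, feats_tgt, y_tgt, k=8):
    # (1) Project: pick k orthogonal directions from source features
    #     (PCA stand-in; the paper learns label-predictive projections).
    centered = feats_src - feats_src.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = vt[:k]                            # (k, d) orthonormal rows
    # (2) Probe: fit a linear head using only the small labeled target set.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(feats_tgt @ proj.T, y_tgt)
    return proj, probe
```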

Predictable MDP Abstraction for Unsupervised Model-Based RL

1 code implementation8 Feb 2023 Seohong Park, Sergey Levine

A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts

1 code implementation6 Feb 2023 Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine

Some robust training algorithms (e. g., Group DRO) specialize to group shifts and require group information on all training points.

Understanding the Complexity Gains of Single-Task RL with a Curriculum

no code implementations24 Dec 2022 Qiyang Li, Yuexiang Zhai, Yi Ma, Sergey Levine

Under mild regularity conditions on the curriculum, we show that sequentially solving each task in the multi-task RL problem is more computationally efficient than solving the original single-task problem, without any explicit exploration bonuses or other exploration strategies.

Reinforcement Learning (RL)

Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios

no code implementations21 Dec 2022 Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Becca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine

In this paper, we show how imitation learning combined with reinforcement learning using simple rewards can substantially improve the safety and reliability of driving policies over those learned from imitation alone.

Autonomous Driving Imitation Learning +2

Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance

no code implementations19 Dec 2022 Kelvin Xu, Zheyuan Hu, Ria Doshi, Aaron Rovinsky, Vikash Kumar, Abhishek Gupta, Sergey Levine

In this paper, we describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks and enable robots with complex multi-fingered hands to learn to perform them through interaction.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning for Visual Navigation

1 code implementation16 Dec 2022 Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.

Navigate Offline RL +3

Learning Robotic Navigation from Experience: Principles, Methods, and Recent Results

no code implementations13 Dec 2022 Sergey Levine, Dhruv Shah

Navigation is one of the most heavily studied problems in robotics, and is conventionally approached as a geometric mapping and planning problem.

Confidence-Conditioned Value Functions for Offline Reinforcement Learning

no code implementations8 Dec 2022 Joey Hong, Aviral Kumar, Sergey Levine

To do so, in this work, we propose learning value functions that additionally condition on the degree of conservatism, which we dub confidence-conditioned value functions.

Offline RL reinforcement-learning +1

Multi-Task Imitation Learning for Linear Dynamical Systems

no code implementations1 Dec 2022 Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.

Imitation Learning Representation Learning
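
A toy sketch of the two phases for linear policies, under the assumption that each source expert acts as u = K_h x; the SVD-based representation estimate here is purely illustrative, not the paper's estimator.

```python
import numpy as np

def pretrain_shared_rep(source_data, k):
    """Phase (a): estimate each source gain by least squares from (X_h, U_h)
    pairs, then extract a shared k-dimensional representation from the
    row space of the stacked estimates."""
    gains = [np.linalg.lstsq(X, U, rcond=None)[0].T for X, U in source_data]
    _, _, vt = np.linalg.svd(np.vstack(gains), full_matrices=False)
    return vt[:k]                                   # shared representation W, (k, d)

def finetune_target(W, X_tgt, U_tgt):
    """Phase (b): fit a small head A on target data so the policy is u = A W x."""
    A = np.linalg.lstsq(X_tgt @ W.T, U_tgt, rcond=None)[0].T
    return A @ W                                    # estimated target gain
```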

Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

no code implementations28 Nov 2022 Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine

The potential of offline reinforcement learning (RL) is that high-capacity models trained on large, heterogeneous datasets can lead to agents that generalize broadly, analogously to similar advances in vision and NLP.

Offline RL Q-Learning +2

Data-Driven Offline Decision-Making via Invariant Representation Learning

no code implementations21 Nov 2022 Han Qi, Yi Su, Aviral Kumar, Sergey Levine

The goal in offline data-driven decision-making is to synthesize decisions that optimize a black-box utility function, using a previously collected static dataset, with no active interaction.

Decision Making Domain Adaptation +2

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

no code implementations21 Nov 2022 Ted Xiao, Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson

To accomplish this, we introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL): we utilize semi-supervised language labels leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data and then train language-conditioned policies on the augmented datasets.

Imitation Learning

Dual Generator Offline Reinforcement Learning

no code implementations2 Nov 2022 Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar

In this paper, we show that the issue of conflicting objectives can be resolved by training two generators: one that maximizes return, with the other capturing the "remainder" of the data distribution in the offline dataset, such that the mixture of the two is close to the behavior policy.

Offline RL reinforcement-learning +1

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

no code implementations2 Nov 2022 Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine

Both theoretically and empirically, we show that typical offline RL methods, which are based on distribution constraints, fail to learn from data with such non-uniform variability, due to the requirement to stay close to the behavior policy to the same extent across the state space.

Atari Games Offline RL +2

Adversarial Policies Beat Superhuman Go AIs

2 code implementations1 Nov 2022 Tony T. Wang, Adam Gleave, Tom Tseng, Nora Belrose, Kellin Pelrine, Joseph Miller, Michael D Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies that play against frozen KataGo victims.

Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision

no code implementations27 Oct 2022 Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine

One of the main observations we make in this work is that, with a suitable representation learning and domain generalization approach, it can be significantly easier for the reward function to generalize to a new but structurally similar task (e.g., inserting a new type of connector) than for the policy.

Domain Generalization Representation Learning

Towards Better Few-Shot and Finetuning Performance with Forgetful Causal Language Models

no code implementations24 Oct 2022 Hao Liu, Xinyang Geng, Lisa Lee, Igor Mordatch, Sergey Levine, Sharan Narang, Pieter Abbeel

Large language models (LLMs) trained using the next-token-prediction objective, such as GPT3 and PaLM, have revolutionized natural language processing in recent years by showing impressive zero-shot and few-shot capabilities across a wide range of tasks.

Language Modelling Natural Language Inference +1

Unpacking Reward Shaping: Understanding the Benefits of Reward Engineering on Sample Complexity

no code implementations18 Oct 2022 Abhishek Gupta, Aldo Pacchiano, Yuexiang Zhai, Sham M. Kakade, Sergey Levine

Reinforcement learning provides an automated framework for learning behaviors from high-level reward specifications, but in practice the choice of reward function can be crucial for good results -- while in principle the reward only needs to specify what the task is, in reality practitioners often need to design more detailed rewards that provide the agent with some hints about how the task should be completed.

reinforcement-learning Reinforcement Learning (RL)

You Only Live Once: Single-Life Reinforcement Learning

no code implementations17 Oct 2022 Annie S. Chen, Archit Sharma, Sergey Levine, Chelsea Finn

We formalize this problem setting, which we call single-life reinforcement learning (SLRL), where an agent must complete a task within a single episode without interventions, utilizing its prior experience while contending with some form of novelty.

Continuous Control reinforcement-learning +1

Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks

no code implementations12 Oct 2022 Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine

The utilization of broad datasets has proven to be crucial for generalization for a wide range of fields.

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

1 code implementation11 Oct 2022 Aviral Kumar, Anikait Singh, Frederik Ebert, Yanlai Yang, Chelsea Finn, Sergey Levine

To the best of our knowledge, PTR is the first offline RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.

Offline RL Q-Learning +1

GNM: A General Navigation Model to Drive Any Robot

1 code implementation7 Oct 2022 Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine

Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data.

Distributionally Adaptive Meta Reinforcement Learning

no code implementations6 Oct 2022 Anurag Ajay, Abhishek Gupta, Dibya Ghosh, Sergey Levine, Pulkit Agrawal

In this work, we develop a framework for meta-RL algorithms that are able to behave appropriately under test-time distribution shifts in the space of tasks.

Meta Reinforcement Learning reinforcement-learning +1

Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective

no code implementations18 Sep 2022 Raj Ghugare, Homanga Bharadhwaj, Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov

In this work, we propose a single objective which jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.

Reinforcement Learning (RL) Value prediction

GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots

1 code implementation12 Sep 2022 Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine

In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots.

A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning

1 code implementation16 Aug 2022 Laura Smith, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.

reinforcement-learning Reinforcement Learning (RL)

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

1 code implementation9 Aug 2022 Marwa Abdulhai, Natasha Jaques, Sergey Levine

IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them.

reinforcement-learning Reinforcement Learning (RL)

Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot

no code implementations1 Aug 2022 Yandong Ji, Zhongyu Li, Yinan Sun, Xue Bin Peng, Sergey Levine, Glen Berseth, Koushil Sreenath

Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task.

Friction Hierarchical Reinforcement Learning +3

LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action

1 code implementation10 Jul 2022 Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine

Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings.

Association Instruction Following +1

Offline RL Policies Should be Trained to be Adaptive

no code implementations5 Jul 2022 Dibya Ghosh, Anurag Ajay, Pulkit Agrawal, Sergey Levine

Offline RL algorithms must account for the fact that the dataset they are provided may leave many facets of the environment unknown.

Offline RL

Object Representations as Fixed Points: Training Iterative Refinement Algorithms with Implicit Differentiation

1 code implementation2 Jul 2022 Michael Chang, Thomas L. Griffiths, Sergey Levine

Iterative refinement -- start with a random guess, then iteratively improve the guess -- is a useful paradigm for representation learning because it offers a way to break symmetries among equally plausible explanations for the data.

Representation Learning

Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

no code implementations21 Jun 2022 Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine

Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs.

Density Estimation

Contrastive Learning as Goal-Conditioned Reinforcement Learning

no code implementations15 Jun 2022 Benjamin Eysenbach, Tianjun Zhang, Ruslan Salakhutdinov, Sergey Levine

While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable and instead equips RL algorithms with additional representation learning parts (e.g., auxiliary losses, data augmentation).

Contrastive Learning Data Augmentation +4
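
The core object here is a critic factored into state-action and goal embeddings, trained contrastively with future states from the same trajectory as positives. A minimal InfoNCE-style sketch, assuming the embedding networks are given:

```python
import torch
import torch.nn.functional as F

def contrastive_critic_loss(phi_sa, psi_goal):
    """phi_sa:   (B, d) embeddings of (state, action) pairs.
    psi_goal: (B, d) embeddings of future states from the same trajectories;
    row i is the positive for row i, other rows serve as negatives."""
    logits = phi_sa @ psi_goal.T            # (B, B) pairwise similarities
    labels = torch.arange(len(logits))      # diagonal entries are positives
    return F.cross_entropy(logits, labels)
```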

Imitating Past Successes can be Very Suboptimal

no code implementations7 Jun 2022 Benjamin Eysenbach, Soumith Udatha, Sergey Levine, Ruslan Salakhutdinov

Prior work has proposed a simple strategy for reinforcement learning (RL): label experience with the outcomes achieved in that experience, and then imitate the relabeled experience.

Imitation Learning Reinforcement Learning (RL)

Adversarial Unlearning: Reducing Confidence Along Adversarial Directions

no code implementations3 Jun 2022 Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine

Supervised learning methods trained with maximum likelihood objectives often overfit on training data.

Data Augmentation

Multimodal Masked Autoencoders Learn Transferable Representations

1 code implementation27 May 2022 Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel

We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.

Contrastive Learning

First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization

1 code implementation24 May 2022 Siddharth Reddy, Sergey Levine, Anca D. Dragan

How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?

Planning with Diffusion for Flexible Behavior Synthesis

1 code implementation20 May 2022 Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.

Decision Making Denoising +2

Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space

no code implementations17 May 2022 Kuan Fang, Patrick Yin, Ashvin Nair, Sergey Levine

Our experimental results show that PTP can generate feasible sequences of subgoals that enable the policy to efficiently solve the target tasks.

reinforcement-learning Reinforcement Learning (RL)

ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters

no code implementations4 May 2022 Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, Sanja Fidler

By leveraging a massively parallel GPU-based simulator, we are able to train skill embeddings using over a decade of simulated experiences, enabling our model to learn a rich and versatile repertoire of skills.

Imitation Learning Unsupervised Reinforcement Learning

Control-Aware Prediction Objectives for Autonomous Driving

no code implementations28 Apr 2022 Rowan Mcallister, Blake Wulfe, Jean Mercat, Logan Ellis, Sergey Levine, Adrien Gaidon

Autonomous vehicle software is typically structured as a modular pipeline of individual components (e.g., perception, prediction, and planning) to help separate concerns into interpretable sub-tasks.

Autonomous Driving Trajectory Prediction

Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning

no code implementations27 Apr 2022 Philippe Hansen-Estruch, Amy Zhang, Ashvin Nair, Patrick Yin, Sergey Levine

We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulation manipulation tasks.

reinforcement-learning Reinforcement Learning (RL)

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

1 code implementation NAACL 2022 Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.

Chatbot Offline RL +2

INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL

no code implementations ICLR 2022 Homanga Bharadhwaj, Mohammad Babaeizadeh, Dumitru Erhan, Sergey Levine

We propose a modified objective for model-based RL that, in combination with mutual information maximization, allows us to learn representations and dynamics for visual model-based RL without reconstruction in a way that explicitly prioritizes functionally relevant factors.

Model-based Reinforcement Learning Reinforcement Learning (RL)

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?

no code implementations12 Apr 2022 Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine

To answer this question, we characterize the properties of environments that allow offline RL methods to perform better than BC methods, even when only provided with expert data.

Atari Games Imitation Learning +3

Jump-Start Reinforcement Learning

no code implementations5 Apr 2022 Ikechukwu Uchendu, Ted Xiao, Yao Lu, Banghua Zhu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Cong Ma, Jiantao Jiao, Sergey Levine, Karol Hausman

In addition, we provide an upper bound on the sample complexity of JSRL and show that with the help of a guide-policy, one can improve the sample complexity for non-optimism exploration methods from exponential in horizon to polynomial.

reinforcement-learning Reinforcement Learning (RL)

ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints

no code implementations23 Feb 2022 Dhruv Shah, Sergey Levine

In this work, we propose an approach that integrates learning and planning, and can utilize side information such as schematic roadmaps, satellite maps and GPS coordinates as a planning heuristic, without relying on them being accurate.

3D Reconstruction General Knowledge +1

Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization

3 code implementations17 Feb 2022 Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine

To address this, we present Design-Bench, a benchmark for offline MBO with a unified evaluation protocol and reference implementations of recent methods.

ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning

no code implementations5 Feb 2022 Sean Chen, Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine

Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface.

reinforcement-learning Reinforcement Learning (RL)

BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning

no code implementations4 Feb 2022 Eric Jang, Alex Irpan, Mohi Khansari, Daniel Kappler, Frederik Ebert, Corey Lynch, Sergey Levine, Chelsea Finn

In this paper, we study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning.

Imitation Learning

Fully Online Meta-Learning Without Task Boundaries

no code implementations1 Feb 2022 Jathushan Rajasegaran, Chelsea Finn, Sergey Levine

In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future.

Meta-Learning

RvS: What is Essential for Offline RL via Supervised Learning?

1 code implementation20 Dec 2021 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.

Offline RL
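
The RvS recipe is conditioning plus maximum likelihood, with no TD learning anywhere; a minimal sketch, assuming a policy that returns a torch.distributions object given states and an outcome variable (a goal or a reward target):

```python
import torch

def rvs_loss(policy, states, actions, omegas):
    """Supervised policy objective: maximize the likelihood of dataset
    actions under a policy conditioned on the outcome variable omega."""
    dist = policy(states, omegas)             # e.g., a Gaussian over actions
    return -dist.log_prob(actions).mean()     # plain maximum likelihood
```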

Autonomous Reinforcement Learning: Formalism and Benchmarking

2 code implementations ICLR 2022 Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn

In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.

Benchmarking reinforcement-learning +1

DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization

no code implementations ICLR 2022 Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine

In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations.

Atari Games D4RL +3

Extending the WILDS Benchmark for Unsupervised Adaptation

1 code implementation ICLR 2022 Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.

CoMPS: Continual Meta Policy Search

no code implementations ICLR 2022 Glen Berseth, Zhiwei Zhang, Grace Zhang, Chelsea Finn, Sergey Levine

Beyond simply transferring past experience to new tasks, our goal is to devise continual reinforcement learning algorithms that learn to learn, using their experience on previous tasks to learn new tasks more quickly.

Continual Learning Continuous Control +5

Information is Power: Intrinsic Control via Information Capture

no code implementations NeurIPS 2021 Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine

We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.

Bayesian Adaptation for Covariate Shift

no code implementations NeurIPS 2021 Aurick Zhou, Sergey Levine

When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates. While improving the robustness of neural networks is one promising approach to mitigate this issue, an appealing alternative to robustifying networks against all possible test-time shifts is to instead directly adapt them to unlabeled inputs from the particular distribution shift we encounter at test time. However, this poses a challenging question: in the standard Bayesian model for supervised learning, unlabeled inputs are conditionally independent of model parameters when the labels are unobserved, so what can unlabeled data tell us about the model parameters at test-time?

Domain Adaptation Image Classification

TRAIL: Near-Optimal Imitation Learning with Suboptimal Data

1 code implementation ICLR 2022 Mengjiao Yang, Sergey Levine, Ofir Nachum

In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.

Imitation Learning

Understanding the World Through Action

1 code implementation24 Oct 2021 Sergey Levine

The recent history of machine learning research has taught us that machine learning methods can be most effective when they are provided with very large, high-capacity models, and trained on very large and diverse datasets.

reinforcement-learning Reinforcement Learning (RL)

C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks

no code implementations ICLR 2022 Tianjun Zhang, Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine, Joseph E. Gonzalez

Goal-conditioned reinforcement learning (RL) can solve tasks in a wide range of domains, including navigation and manipulation, but learning to reach distant goals remains a central challenge to the field.

Reinforcement Learning (RL)

Data-Driven Offline Optimization For Architecting Hardware Accelerators

1 code implementation ICLR 2022 Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine

An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data, to architect hardware accelerators, without needing any form of simulations.

Computer Architecture and Systems

MEMO: Test Time Robustness via Adaptation and Augmentation

2 code implementations18 Oct 2021 Marvin Zhang, Sergey Levine, Chelsea Finn

We study the problem of test time robustification, i.e., using the test input to improve model robustness.
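
A sketch of one marginal-entropy-minimization adaptation step in the spirit of MEMO, assuming a classifier `model` returning logits and an `augment` function; the hyperparameters and interfaces are illustrative.

```python
import torch

def memo_step(model, optimizer, x, augment, n_aug=32):
    """Adapt the model to a single test input: average predictions over
    augmented copies and minimize the entropy of that marginal."""
    views = torch.stack([augment(x) for _ in range(n_aug)])
    probs = model(views).softmax(dim=-1).mean(dim=0)          # marginal prediction
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum()   # confidence objective
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return model(x.unsqueeze(0)).argmax(dim=-1)               # adapted prediction
```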

Offline Reinforcement Learning with Implicit Q-Learning

9 code implementations12 Oct 2021 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

D4RL Offline RL +3
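
The upper-expectile step described above reduces to a simple asymmetric regression loss; a minimal sketch (tau = 0.5 recovers mean regression, while tau near 1 approaches a max over dataset actions):

```python
import torch

def expectile_loss(q_target, v_pred, tau=0.9):
    """Asymmetric squared error: for tau > 0.5, positive errors are
    upweighted, so V(s) tracks an upper expectile of Q(s, a) over
    actions that appear in the dataset."""
    diff = q_target - v_pred
    weight = torch.abs(tau - (diff < 0).float())  # tau if diff >= 0, else 1 - tau
    return (weight * diff ** 2).mean()
```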

Mismatched No More: Joint Model-Policy Optimization for Model-Based RL

1 code implementation6 Oct 2021 Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov

Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning.

Model-based Reinforcement Learning Reinforcement Learning (RL)

The Information Geometry of Unsupervised Reinforcement Learning

1 code implementation ICLR 2022 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function.

Contrastive Learning reinforcement-learning +3

Should I Run Offline Reinforcement Learning or Behavioral Cloning?

no code implementations ICLR 2022 Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine

In this paper, our goal is to characterize environments and dataset compositions where offline RL leads to better performance than BC.

Atari Games Offline RL +3

Data Sharing without Rewards in Multi-Task Offline Reinforcement Learning

no code implementations29 Sep 2021 Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Chelsea Finn, Sergey Levine, Karol Hausman

However, these benefits come at a cost -- for data to be shared between tasks, each transition must be annotated with reward labels corresponding to other tasks.

Multi-Task Learning Offline RL +2

Test Time Robustification of Deep Models via Adaptation and Augmentation

no code implementations29 Sep 2021 Marvin Mengxin Zhang, Sergey Levine, Chelsea Finn

We study the problem of test time robustification, i.e., using the test input to improve model robustness.

Offline Reinforcement Learning with In-sample Q-Learning

1 code implementation ICLR 2022 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

D4RL Offline RL +3

The Essential Elements of Offline RL via Supervised Learning

no code implementations ICLR 2022 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

These methods, which we collectively refer to as reinforcement learning via supervised learning (RvS), involve a number of design decisions, such as policy architectures and how the conditioning variable is constructed.

Offline RL reinforcement-learning +1

FitVid: High-Capacity Pixel-Level Video Prediction

no code implementations29 Sep 2021 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan

Furthermore, such an agent can internally represent the complex dynamics of the real-world and therefore can acquire a representation useful for a variety of visual perception tasks.

Image Augmentation Video Prediction

Training on Test Data with Bayesian Adaptation for Covariate Shift

no code implementations27 Sep 2021 Aurick Zhou, Sergey Levine

When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.

Domain Adaptation Image Classification

A Workflow for Offline Model-Free Robotic Reinforcement Learning

1 code implementation22 Sep 2021 Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.

Offline RL reinforcement-learning +1

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

no code implementations NeurIPS 2021 Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn

We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks, and utilize all of this data to learn behaviors for all the tasks more effectively rather than training each one in isolation.

Offline RL reinforcement-learning +1

Robust Predictable Control

1 code implementation NeurIPS 2021 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression.

Decision Making Reinforcement Learning (RL)

Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation

no code implementations28 Jul 2021 Charles Sun, Jędrzej Orbik, Coline Devin, Brian Yang, Abhishek Gupta, Glen Berseth, Sergey Levine

Our aim is to devise a robotic reinforcement learning system for learning navigation and manipulation together, in an autonomous way without human intervention, enabling continual learning under realistic assumptions.

Continual Learning Navigate +2

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

no code implementations15 Jul 2021 Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance toward positive outcomes.

Meta-Learning reinforcement-learning +1

Conservative Objective Models for Effective Offline Model-Based Optimization

2 code implementations14 Jul 2021 Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

Computational design problems arise in a number of settings, from synthetic biology to computer architectures.

Offline Meta-Reinforcement Learning with Online Self-Supervision

1 code implementation8 Jul 2021 Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine

If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.

Meta Reinforcement Learning Offline RL +2

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

1 code implementation NeurIPS 2021 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.

Car Racing Decision Making +1

Modularity in Reinforcement Learning via Algorithmic Independence in Credit Assignment

no code implementations ICLR Workshop Learning_to_Learn 2021 Michael Chang, Sidhant Kaushik, Sergey Levine, Thomas L. Griffiths

Empirical evidence suggests that such action-value methods are more sample efficient than policy-gradient methods on transfer problems that require only sparse changes to a sequence of previously optimal decisions.

Decision Making Policy Gradient Methods +2

Model-Based Reinforcement Learning via Latent-Space Collocation

1 code implementation24 Jun 2021 Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine

The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.

Model-based Reinforcement Learning reinforcement-learning +1

FitVid: Overfitting in Pixel-Level Video Prediction

1 code implementation24 Jun 2021 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan

There is a growing body of evidence that underfitting on the training data is one of the primary causes of low-quality predictions.

Image Augmentation Video Generation +1

Hierarchically Integrated Models: Learning to Navigate from Heterogeneous Robots

no code implementations24 Jun 2021 Katie Kang, Gregory Kahn, Sergey Levine

In this work, we propose a deep reinforcement learning algorithm with hierarchically integrated models (HInt).

Navigate reinforcement-learning +1

Reinforcement Learning as One Big Sequence Modeling Problem

1 code implementation ICML Workshop URL 2021 Michael Janner, Qiyang Li, Sergey Levine

However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.

Imitation Learning Offline RL +2
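
Under this view, a trajectory becomes one long token stream; the sketch below shows a uniform-binning tokenization (the bin count, value range, and per-dimension layout are illustrative assumptions) on which any autoregressive model, e.g. a small GPT, could then be trained.

```python
import numpy as np

def tokenize_trajectory(states, actions, rewards, n_bins=100, low=-1.0, high=1.0):
    """Flatten each transition into discrete tokens: one token per state
    dimension, one per action dimension, then one for the reward."""
    def discretize(x):
        edges = np.linspace(low, high, n_bins + 1)[1:-1]  # inner bin edges
        return np.digitize(np.clip(x, low, high), edges)  # tokens in [0, n_bins)
    tokens = []
    for s, a, r in zip(states, actions, rewards):
        tokens.extend(discretize(s))
        tokens.extend(discretize(a))
        tokens.append(int(discretize(np.array([r]))[0]))
    return np.array(tokens, dtype=np.int64)
```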

Intrinsic Control of Variational Beliefs in Dynamic Partially-Observed Visual Environments

no code implementations ICML Workshop URL 2021 Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine

We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.

Offline Reinforcement Learning as One Big Sequence Modeling Problem

2 code implementations NeurIPS 2021 Michael Janner, Qiyang Li, Sergey Levine

Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.

Imitation Learning Offline RL +2

Variational Empowerment as Representation Learning for Goal-Based Reinforcement Learning

no code implementations2 Jun 2021 Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Shane Gu

Learning to reach goal states and learning diverse skills through mutual information (MI) maximization have been proposed as principled frameworks for self-supervised reinforcement learning, allowing agents to acquire broadly applicable multitask policies with minimal reward engineering.

reinforcement-learning Reinforcement Learning (RL) +1

What Can I Do Here? Learning New Skills by Imagining Visual Affordances

1 code implementation1 Jun 2021 Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine

In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.

DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies

no code implementations23 Apr 2021 Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine

Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity.

reinforcement-learning Reinforcement Learning (RL) +1

Contingencies from Observations: Tractable Contingency Planning with Learned Behavior Models

1 code implementation21 Apr 2021 Nicholas Rhinehart, Jeff He, Charles Packer, Matthew A. Wright, Rowan Mcallister, Joseph E. Gonzalez, Sergey Levine

Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents.

Outcome-Driven Reinforcement Learning via Variational Inference

no code implementations NeurIPS 2021 Tim G. J. Rudner, Vitchyr H. Pong, Rowan Mcallister, Yarin Gal, Sergey Levine

While reinforcement learning algorithms provide automated acquisition of optimal policies, practical application of such methods requires a number of design decisions, such as manually designing reward functions that not only define the task, but also provide sufficient shaping to accomplish it.

reinforcement-learning Reinforcement Learning (RL) +1

MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale

no code implementations16 Apr 2021 Dmitry Kalashnikov, Jacob Varley, Yevgen Chebotar, Benjamin Swanson, Rico Jonschkowski, Chelsea Finn, Sergey Levine, Karol Hausman

In this paper, we study how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously, sharing exploration, experience, and representations across tasks.

reinforcement-learning Reinforcement Learning (RL)

Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills

no code implementations15 Apr 2021 Yevgen Chebotar, Karol Hausman, Yao Lu, Ted Xiao, Dmitry Kalashnikov, Jake Varley, Alex Irpan, Benjamin Eysenbach, Ryan Julian, Chelsea Finn, Sergey Levine

We consider the problem of learning useful robotic skills from previously collected offline data without access to manually specified rewards or additional online exploration, a setting that is becoming increasingly important for scaling robot learning by reusing past robotic data.

Q-Learning reinforcement-learning +1

Rapid Exploration for Open-World Navigation with Latent Goal Models

no code implementations12 Apr 2021 Dhruv Shah, Benjamin Eysenbach, Nicholas Rhinehart, Sergey Levine

We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.

Autonomous Navigation

AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control

2 code implementations5 Apr 2021 Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa

Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.

Imitation Learning Reinforcement Learning (RL)

Benchmarks for Deep Off-Policy Evaluation

3 code implementations ICLR 2021 Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.

Benchmarking Continuous Control +3

Accelerating Online Reinforcement Learning via Model-Based Meta-Learning

no code implementations ICLR Workshop Learning_to_Learn 2021 John D Co-Reyes, Sarah Feng, Glen Berseth, Jie Qui, Sergey Levine

Current reinforcement learning algorithms struggle to quickly adapt to new situations without large amounts of experience and usually without large amounts of optimization over that experience.

Meta-Learning reinforcement-learning +1

Maximum Entropy RL (Provably) Solves Some Robust RL Problems

no code implementations ICLR 2022 Benjamin Eysenbach, Sergey Levine

Many potential applications of reinforcement learning (RL) require guarantees that the agent will perform well in the face of disturbances to the dynamics or reward function.

Reinforcement Learning (RL)

Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation

no code implementations ICLR 2021 Justin Fu, Sergey Levine

We propose to tackle this problem by leveraging the normalized maximum-likelihood (NML) estimator, which provides a principled approach to handling uncertainty and out-of-distribution inputs.

COMBO: Conservative Offline Model-Based Policy Optimization

4 code implementations NeurIPS 2021 Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.

Offline RL
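
A sketch of the conservative term this describes, assuming a `q_net(states, actions)` interface; the synthetic tuples come from rollouts under the learned dynamics model, and the term would be added to an ordinary TD loss.

```python
import torch

def combo_regularizer(q_net, model_states, model_actions,
                      data_states, data_actions, beta=1.0):
    """Push Q down on state-action tuples from model rollouts (possibly
    out-of-support) and up on tuples from the real offline dataset."""
    q_model = q_net(model_states, model_actions)
    q_data = q_net(data_states, data_actions)
    return beta * (q_model.mean() - q_data.mean())
```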

How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned

no code implementations4 Feb 2021 Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pastor, Sergey Levine

Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains.

reinforcement-learning Reinforcement Learning (RL)

Evolving Reinforcement Learning Algorithms

5 code implementations ICLR 2021 John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust

Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.

Atari Games Meta-Learning +2

Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments

no code implementations ICLR 2021 Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer

To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).

Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples

no code implementations1 Jan 2021 Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.

reinforcement-learning Reinforcement Learning (RL)

Invariant Representations for Reinforcement Learning without Reconstruction

no code implementations ICLR 2021 Amy Zhang, Rowan Thomas McAllister, Roberto Calandra, Yarin Gal, Sergey Levine

We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.

Causal Inference reinforcement-learning +2

Variable-Shot Adaptation for Incremental Meta-Learning

no code implementations1 Jan 2021 Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.

Meta-Learning Zero-Shot Learning

On Trade-offs of Image Prediction in Visual Model-Based Reinforcement Learning

no code implementations1 Jan 2021 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Dumitru Erhan, Harini Kannan, Chelsea Finn, Sergey Levine

In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.

Model-based Reinforcement Learning reinforcement-learning +1

Model-Based Visual Planning with Self-Supervised Functional Distances

1 code implementation ICLR 2021 Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot.

reinforcement-learning Reinforcement Learning (RL)

ViNG: Learning Open-World Navigation with Visual Goals

no code implementations17 Dec 2020 Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine

We propose a learning-based navigation system for reaching visually indicated goals and demonstrate this system on a real mobile robot platform.

Navigate reinforcement-learning +1

Variable-Shot Adaptation for Online Meta-Learning

no code implementations14 Dec 2020 Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.

Meta-Learning Zero-Shot Learning

Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning

1 code implementation8 Dec 2020 Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan

In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.

Model-based Reinforcement Learning Reinforcement Learning (RL)

Continual Learning of Control Primitives : Skill Discovery via Reset-Games

no code implementations NeurIPS 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

First, in real-world settings, when an agent attempts a task and fails, the environment must somehow "reset" so that the agent can attempt the task again.

Continual Learning

Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation NeurIPS 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the gamma-model, a predictive model of environment dynamics with an infinite, probabilistic horizon.

Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

no code implementations ICLR 2021 Avi Singh, Huihan Liu, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine

Reinforcement learning provides a general framework for flexible decision making and control, but requires extensive data collection for each new task that an agent needs to learn.

Decision Making reinforcement-learning +1

C-Learning: Learning to Achieve Goals via Recursive Classification

no code implementations ICLR 2021 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

This problem, which can be viewed as a reframing of goal-conditioned reinforcement learning (RL), is centered around learning a conditional probability density function over future states.

Classification Density Estimation +3

Continual Learning of Control Primitives: Skill Discovery via Reset-Games

1 code implementation10 Nov 2020 Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine

Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.

Continual Learning

Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation

no code implementations5 Nov 2020 Aurick Zhou, Sergey Levine

In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks.

Bayesian Inference
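
For reference, the conditional NML distribution that ACNML amortizes can be written as follows, where D is the training set and the hatted parameters are the maximizer after appending the candidate pair (x, y):

```latex
\hat{\theta}_{(x,y)} = \arg\max_{\theta}\ \log p(y \mid x, \theta) + \log p(\mathcal{D} \mid \theta),
\qquad
p_{\mathrm{CNML}}(y \mid x) = \frac{p\big(y \mid x, \hat{\theta}_{(x,y)}\big)}
{\sum_{y'} p\big(y' \mid x, \hat{\theta}_{(x,y')}\big)}.
```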

One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL

no code implementations NeurIPS 2020 Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn

While reinforcement learning algorithms can learn effective policies for complex tasks, these policies are often brittle to even minor task variations, especially when variations are not explicitly provided during training.

COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning

1 code implementation27 Oct 2020 Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine

Reinforcement learning has been applied to a wide variety of robotics problems, but most such applications involve collecting data from scratch for each new task.

reinforcement-learning Reinforcement Learning (RL)

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

1 code implementation ICLR 2021 Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine

We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network.

reinforcement-learning Reinforcement Learning (RL)
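
The expressivity collapse is diagnosed with an effective-rank measure on the feature matrix of the value network's penultimate layer; a small sketch of that diagnostic (delta = 0.01 follows the paper's convention, the rest is a straightforward NumPy computation):

```python
import numpy as np

def effective_rank(features, delta=0.01):
    """Smallest k such that the top-k singular values of the feature
    matrix capture a (1 - delta) fraction of the total spectrum mass."""
    sv = np.linalg.svd(features, compute_uv=False)
    cum = np.cumsum(sv) / sv.sum()
    return int(np.searchsorted(cum, 1.0 - delta) + 1)
```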

Conservative Safety Critics for Exploration

no code implementations ICLR 2021 Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Safe exploration presents a major challenge in reinforcement learning (RL): when active data collection requires deploying partially trained policies, we must ensure that these policies avoid catastrophically unsafe regions, while still enabling trial and error learning.

Reinforcement Learning (RL) Safe Exploration

Generative Temporal Difference Learning for Infinite-Horizon Prediction

1 code implementation27 Oct 2020 Michael Janner, Igor Mordatch, Sergey Levine

We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
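
The model can be trained without committing to a horizon by bootstrapping its own predictions, in the spirit of temporal-difference learning; with $a' \sim \pi(\cdot \mid s')$ at the next state, the target distribution reads:

```latex
p_\gamma(s_e \mid s, a) \;=\; (1-\gamma)\, p(s_e \mid s, a)
\;+\; \gamma\, \mathbb{E}_{s' \sim p(\cdot \mid s, a),\, a' \sim \pi(\cdot \mid s')}
\big[\, p_\gamma(s_e \mid s', a') \,\big].
```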

MELD: Meta-Reinforcement Learning from Images via Latent State Models

1 code implementation26 Oct 2020 Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine

Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.

Meta-Learning Meta Reinforcement Learning +3

OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning

no code implementations ICLR 2021 Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum

Reinforcement learning (RL) has achieved impressive performance in a variety of online settings in which an agent's ability to query the environment for transitions and rewards is effectively unlimited.

Few-Shot Imitation Learning Imitation Learning +3

LaND: Learning to Navigate from Disengagements

1 code implementation9 Oct 2020 Gregory Kahn, Pieter Abbeel, Sergey Levine

However, we believe that these disengagements not only show where the system fails, which is useful for troubleshooting, but also provide a direct learning signal by which the robot can learn to navigate.

Autonomous Navigation Imitation Learning +3

Emergent Social Learning via Multi-agent Reinforcement Learning

no code implementations1 Oct 2020 Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques

We analyze the reasons for this deficiency, and show that by imposing constraints on the training environment and introducing a model-based auxiliary loss we are able to obtain generalized social learning policies which enable agents to: i) discover complex skills that are not learned from single-agent training, and ii) adapt online to novel environments by taking cues from experts present in the new environment.

Imitation Learning Multi-agent Reinforcement Learning +2

Amortized Conditional Normalized Maximum Likelihood

no code implementations28 Sep 2020 Aurick Zhou, Sergey Levine

In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks.