7 code implementations • 23 Apr 2017 • Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine
While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.
Ranked #3 on Video Alignment on UPenn Action
12 code implementations • NeurIPS 2018 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.
Tasks: Hierarchical Reinforcement Learning, Reinforcement Learning (+1)
6 code implementations • CVPR 2017 • Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik
The accumulated belief of the world enables the agent to track visited regions of the environment.
2 code implementations • 16 Nov 2015 • Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih
Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.
7 code implementations • ICLR 2019 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning.
3 code implementations • NeurIPS 2016 • Chelsea Finn, Ian Goodfellow, Sergey Levine
A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment.
Ranked #26 on Video Generation on BAIR Robot Pushing
17 code implementations • 8 Jun 2015 • John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel
Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.
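For reference, the estimator this paper introduces, generalized advantage estimation (GAE), can be summarized in one formula built from the value function V and TD residuals (a sketch from the paper's definitions):

```latex
\hat{A}_t^{\mathrm{GAE}(\gamma,\lambda)} = \sum_{l=0}^{\infty} (\gamma\lambda)^{l}\, \delta_{t+l}^{V},
\qquad
\delta_t^{V} = r_t + \gamma V(s_{t+1}) - V(s_t)
```

Setting λ = 0 recovers the one-step TD advantage estimate, while λ = 1 recovers the Monte Carlo estimate, trading bias against variance.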
1 code implementation • NeurIPS 2019 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
We introduce a general control algorithm that combines the strengths of planning and reinforcement learning to effectively solve these tasks.
1 code implementation • ICLR 2022 • Mengjiao Yang, Sergey Levine, Ofir Nachum
In this work, we answer this question affirmatively and present training objectives that use offline datasets to learn a factored transition model whose structure enables the extraction of a latent action space.
1 code implementation • ICLR 2020 • Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn
If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes.
6 code implementations • NeurIPS 2020 • Michael Dennis, Natasha Jaques, Eugene Vinitsky, Alexandre Bayen, Stuart Russell, Andrew Critch, Sergey Levine
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED).
1 code implementation • NeurIPS 2021 • Benjamin Eysenbach, Sergey Levine, Ruslan Salakhutdinov
Can we devise RL algorithms that instead enable users to specify tasks simply by providing examples of successful outcomes?
1 code implementation • ICLR 2022 • Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine
An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any form of simulation.
76 code implementations • ICML 2018 • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
A platform for Applied Reinforcement Learning (Applied RL)
Ranked #1 on Continuous Control on Lunar Lander (OpenAI Gym)
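For reference, the maximum entropy objective that soft actor-critic optimizes, with temperature α trading off reward against policy entropy, is:

```latex
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_{\pi}}
\big[\, r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\big]
```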
82 code implementations • ICML 2017 • Chelsea Finn, Pieter Abbeel, Sergey Levine
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
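As a rough illustration of the idea, the sketch below runs a first-order variant of the meta-update (FOMAML) on a toy quadratic task; the full algorithm additionally differentiates through the inner gradient step, and all task/function names here are illustrative:

```python
import numpy as np

def loss_and_grad(theta, task):
    """Squared error to a task-specific target; returns loss and gradient."""
    diff = theta - task["target"]
    return 0.5 * np.sum(diff ** 2), diff

def maml_step(theta, tasks, inner_lr=0.1, outer_lr=0.05):
    """One meta-update: adapt to each task, then update the initialization."""
    meta_grad = np.zeros_like(theta)
    for task in tasks:
        _, g = loss_and_grad(theta, task)
        theta_adapted = theta - inner_lr * g           # inner adaptation step
        _, g_adapted = loss_and_grad(theta_adapted, task)
        meta_grad += g_adapted                         # first-order meta-gradient
    return theta - outer_lr * meta_grad / len(tasks)

theta = np.zeros(2)
tasks = [{"target": np.array([1.0, 0.0])}, {"target": np.array([0.0, 1.0])}]
for _ in range(200):
    theta = maml_step(theta, tasks)
print(theta)  # converges toward a point that adapts well to both tasks (here, their mean)
```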
3 code implementations • 4 May 2020 • Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.
17 code implementations • NeurIPS 2020 • Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine
We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
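Schematically, CQL augments the standard Bellman error with a regularizer that pushes Q-values down under a chosen action distribution μ and up on the dataset actions (a sketch of the paper's objective, up to constants):

```latex
\min_{Q}\;
\alpha \Big( \mathbb{E}_{s \sim \mathcal{D},\, a \sim \mu(\cdot \mid s)}\big[Q(s,a)\big]
- \mathbb{E}_{(s,a) \sim \mathcal{D}}\big[Q(s,a)\big] \Big)
+ \tfrac{1}{2}\, \mathbb{E}_{(s,a,s') \sim \mathcal{D}}
\Big[ \big( Q(s,a) - \hat{\mathcal{B}}^{\pi} \hat{Q}(s,a) \big)^{2} \Big]
```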
2 code implementations • 1 Mar 2019 • Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.
Ranked #12 on Atari Games 100k on Atari 100k
1 code implementation • ICLR 2020 • Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma
Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.
Ranked #15 on Video Generation on BAIR Robot Pushing
50 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
5 code implementations • 1 Oct 2019 • Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine
In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines.
Ranked #1 on OpenAI Gym on Humanoid-v2
21 code implementations • 19 Feb 2015 • John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.
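The procedure (TRPO) can be summarized as repeatedly solving a trust-region subproblem: maximize a local surrogate for expected return, subject to a bound on how far the policy moves in KL divergence:

```latex
\max_{\theta}\;
\mathbb{E}_t\!\left[ \frac{\pi_{\theta}(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}\, \hat{A}_t \right]
\quad \text{subject to} \quad
\mathbb{E}_t\big[ D_{\mathrm{KL}}\big( \pi_{\theta_{\mathrm{old}}}(\cdot \mid s_t) \,\|\, \pi_{\theta}(\cdot \mid s_t) \big) \big] \le \delta
```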
3 code implementations • ICML 2017 • Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine
We propose a method for learning expressive energy-based policies for continuous states and actions, something that has previously been feasible only in tabular domains.
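The energy-based policy in question assigns probability in proportion to exponentiated soft Q-values, with temperature α:

```latex
\pi(a \mid s) \propto \exp\!\Big( \tfrac{1}{\alpha}\, Q_{\mathrm{soft}}(s, a) \Big)
```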
4 code implementations • 1 Mar 2016 • Chelsea Finn, Sergey Levine, Pieter Abbeel
We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems.
5 code implementations • ICLR 2020 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.
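At its core, SQIL is a reward-relabeling rule layered on ordinary off-policy RL; a minimal sketch (the transition format and function name here are illustrative):

```python
def sqil_buffer(demos, online_transitions):
    """Relabel rewards SQIL-style: expert transitions get reward 1 and the
    agent's own transitions get reward 0; a standard off-policy RL
    algorithm is then trained on the combined buffer."""
    buffer = []
    for (s, a, s_next) in demos:
        buffer.append((s, a, 1.0, s_next))   # demonstration: reward 1
    for (s, a, s_next) in online_transitions:
        buffer.append((s, a, 0.0, s_next))   # agent experience: reward 0
    return buffer
```

The constant rewards act like the sparsity prior mentioned above: roughly, the agent is rewarded only for staying on demonstrated state-action pairs.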
11 code implementations • NeurIPS 2019 • Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1)
15 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable whose randomness is determined by the action (while still integrating over the dynamics to avoid excessive optimism). Taking a state-conditional upper expectile of this random variable then estimates the value of the best actions in that state.
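The expectile trick amounts to an asymmetric squared loss; a minimal sketch of that loss (names illustrative):

```python
import numpy as np

def expectile_loss(diff, tau=0.9):
    """Asymmetric squared loss on diff = Q(s, a) - V(s).
    With tau > 0.5, positive errors weigh more, so minimizing this
    pushes V(s) toward an upper expectile of Q over dataset actions."""
    weight = np.where(diff > 0, tau, 1 - tau)
    return np.mean(weight * diff ** 2)

print(expectile_loss(np.array([1.0, -1.0])))  # positive residuals dominate
```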
2 code implementations • 20 May 2022 • Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine
Model-based reinforcement learning methods often use learning only for the purpose of estimating an approximate dynamics model, offloading the rest of the decision-making work to classical trajectory optimizers.
2 code implementations • NeurIPS 2018 • Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine
For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.
2 code implementations • ICML 2020 • Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine
Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills.
6 code implementations • 16 Jun 2020 • Ashvin Nair, Abhishek Gupta, Murtaza Dalal, Sergey Levine
If we can instead allow RL algorithms to effectively use previously collected data to aid the online learning process, such applications could be made substantially more practical: the prior data would provide a starting point that mitigates challenges due to exploration and sample complexity, while the online training enables the agent to perfect the desired skill.
1 code implementation • NeurIPS 2021 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
Many of the challenges facing today's reinforcement learning (RL) algorithms, such as robustness, generalization, transfer, and computational efficiency are closely related to compression.
6 code implementations • 8 Apr 2018 • Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne
We further explore a number of methods for integrating multiple clips into the learning process to develop multi-skilled agents capable of performing a rich repertoire of diverse skills.
1 code implementation • ICLR 2020 • Anirudh Goyal, Yoshua Bengio, Matthew Botvinick, Sergey Levine
This is typically the case when we have a standard conditioning input, such as a state observation, and a "privileged" input, which might correspond to the goal of a task, the output of a costly planning algorithm, or communication with another agent.
3 code implementations • 5 Apr 2021 • Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa
Our system produces high-quality motions that are comparable to those achieved by state-of-the-art tracking-based techniques, while also being able to easily accommodate large datasets of unstructured motion clips.
3 code implementations • NeurIPS 2021 • Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
6 code implementations • 14 Dec 2020 • Pang Wei Koh, Shiori Sagawa, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, Percy Liang
Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild.
3 code implementations • NeurIPS 2019 • Aviral Kumar, Justin Fu, George Tucker, Sergey Levine
Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.
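To see where the error enters, recall the Bellman backup that Q-learning iterates:

```latex
Q_{k+1}(s, a) \leftarrow r(s, a) + \gamma\, \mathbb{E}_{s'}\Big[ \max_{a'} Q_k(s', a') \Big]
```

The max over a′ is free to select actions never seen in the dataset, so any overestimation at out-of-distribution actions is copied into the targets and compounds across iterations.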
7 code implementations • 15 Apr 2020 • Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine
In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
1 code implementation • 13 Dec 2022 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance.
8 code implementations • 24 Oct 2019 • Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine
Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors.
Ranked #1 on Meta-Learning on ML10
1 code implementation • 25 May 2023 • Arnav Gudibande, Eric Wallace, Charlie Snell, Xinyang Geng, Hao Liu, Pieter Abbeel, Sergey Levine, Dawn Song
This approach looks to cheaply imitate the proprietary model's capabilities using a weaker open-source model.
10 code implementations • NeurIPS 2018 • Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1)
2 code implementations • 25 Sep 2019 • Anusha Nagabandi, Kurt Konolige, Sergey Levine, Vikash Kumar
Dexterous multi-fingered hands can provide robots with the ability to flexibly perform a wide range of manipulation skills.
2 code implementations • 2 May 2018 • Sergey Levine
The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable.
5 code implementations • ICLR 2019 • Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine
By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.
1 code implementation • 15 Jul 2016 • William Montgomery, Sergey Levine
Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.
8 code implementations • NeurIPS 2016 • Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel
We introduce the value iteration network (VIN): a fully differentiable neural network with a `planning module' embedded within.
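A rough sketch of the planning computation such a module embeds: K steps of value iteration on a gridworld, with the action maximization taken over shifted copies of the value map (in the paper this is a learned convolution; the wrap-around boundary from np.roll is a simplification):

```python
import numpy as np

def value_iteration_module(reward, k=20, gamma=0.9):
    """K iterations of V <- max_a (r + gamma * shift_a(V)) on a 2D grid."""
    v = np.zeros_like(reward)
    for _ in range(k):
        q = np.stack([
            np.roll(v, 1, axis=0), np.roll(v, -1, axis=0),   # up / down
            np.roll(v, 1, axis=1), np.roll(v, -1, axis=1),   # left / right
        ])
        v = reward + gamma * q.max(axis=0)
    return v

grid_reward = np.full((5, 5), -0.1)
grid_reward[4, 4] = 1.0                       # goal cell
print(np.round(value_iteration_module(grid_reward), 2))
```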
1 code implementation • 16 Nov 2018 • Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine
We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin.
1 code implementation • ICLR 2021 • Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine
We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions approximated by deep neural networks are trained with gradient descent on iterated regression targets generated by previous instances of the value network, additional gradient updates decrease the expressivity of the current value network.
7 code implementations • ICLR Workshop LLD 2019 • Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
2 code implementations • NeurIPS 2021 • Michael Janner, Qiyang Li, Sergey Levine
Reinforcement learning (RL) is typically concerned with estimating stationary policies or single-step models, leveraging the Markov property to factorize problems in time.
1 code implementation • ICML Workshop URL 2021 • Michael Janner, Qiyang Li, Sergey Levine
However, we can also view RL as a sequence modeling problem, with the goal being to predict a sequence of actions that leads to a sequence of high rewards.
1 code implementation • 19 Mar 2018 • Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine
Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.
7 code implementations • 30 Oct 2017 • Justin Fu, Katie Luo, Sergey Levine
Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.
Ranked #3 on MuJoCo Games on Ant
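The discriminator used in this paper (AIRL) takes a particular logistic form in which a learned function f competes against the policy's own likelihood:

```latex
D_{\theta}(s, a) = \frac{\exp\{ f_{\theta}(s, a) \}}{\exp\{ f_{\theta}(s, a) \} + \pi(a \mid s)}
```

At convergence, f recovers the expert's advantage, which the paper further decomposes in order to extract a transferable reward function.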
3 code implementations • ICLR 2019 • Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson
We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.
3 code implementations • 16 Apr 2019 • Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.
8 code implementations • 8 Aug 2017 • Anusha Nagabandi, Gregory Kahn, Ronald S. Fearing, Sergey Levine
Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.
Tasks: Model-based Reinforcement Learning, Model Predictive Control (+2)
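The model-based controller in this line of work can be approximated with a few lines of random-shooting MPC; everything below (the model, reward, and dimensions) is a toy stand-in for the learned components:

```python
import numpy as np

def mpc_action(state, dynamics, reward, horizon=10, n_candidates=100, act_dim=2, rng=None):
    """Random-shooting MPC: sample action sequences, roll them out through
    the learned model, and return the first action of the best sequence."""
    if rng is None:
        rng = np.random.default_rng(0)
    best_return, best_action = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=(horizon, act_dim))
        s, total = state, 0.0
        for a in actions:
            s = dynamics(s, a)        # predicted next state
            total += reward(s, a)     # predicted reward
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action                 # execute, observe, replan (MPC)

# Toy stand-ins for the learned dynamics model and task reward:
dynamics = lambda s, a: s + 0.1 * a
reward = lambda s, a: -np.sum(s ** 2)
print(mpc_action(np.array([1.0, -1.0]), dynamics, reward))
```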
2 code implementations • 22 May 2023 • Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.
1 code implementation • 16 Mar 2020 • Akhil Padmanabha, Frederik Ebert, Stephen Tian, Roberto Calandra, Chelsea Finn, Sergey Levine
We compare with a state-of-the-art tactile sensor that is only sensitive on one side, as well as a state-of-the-art multi-directional tactile sensor, and find that OmniTact's combination of high-resolution and multi-directional sensing is crucial for reliably inserting the electrical connector and allows for higher accuracy in the state estimation task.
1 code implementation • 8 Oct 2018 • Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine
In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV).
4 code implementations • ICLR 2019 • Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine
However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.
Ranked #1 on Video Prediction on KTH (Cond metric)
9 code implementations • NeurIPS 2020 • Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.
2 code implementations • 5 Feb 2018 • Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, Sergey Levine
Humans and animals are capable of learning a new behavior by observing others perform the skill just once.
3 code implementations • 14 Sep 2017 • Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine
In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration.
1 code implementation • 28 Jul 2023 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.
2 code implementations • ICLR 2020 • Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.
1 code implementation • 5 Dec 2019 • Abdul Rahman Kreidieh, Glen Berseth, Brandon Trabucco, Samyak Parajuli, Sergey Levine, Alexandre M. Bayen
This allows us to draw on connections between communication and cooperation in multi-agent RL, and demonstrate the benefits of increased cooperation between sub-policies on the training performance of the overall policy.
Tasks: Hierarchical Reinforcement Learning, Reinforcement Learning (+1)
3 code implementations • 11 Nov 2016 • Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine
In particular, we demonstrate an equivalence between a sample-based algorithm for maximum entropy IRL and a GAN in which the generator's density can be evaluated and is provided as an additional input to the discriminator.
1 code implementation • NeurIPS 2018 • Chelsea Finn, Kelvin Xu, Sergey Levine
However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e.g., a classifier) for that task that is accurate.
1 code implementation • 16 Aug 2022 • Laura Smith, Ilya Kostrikov, Sergey Levine
Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.
1 code implementation • 7 Oct 2022 • Dhruv Shah, Ajay Sridhar, Arjun Bhorkar, Noriaki Hirose, Sergey Levine
Learning provides a powerful tool for vision-based navigation, but the capabilities of learning-based policies are constrained by limited training data.
6 code implementations • NeurIPS 2020 • Tianhe Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma
We also characterize the trade-off between the gain and risk of leaving the support of the batch data.
4 code implementations • NeurIPS 2021 • Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.
1 code implementation • 28 Sep 2016 • Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel
To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function so that it satisfies the following property: short-horizon planning (as is realistic during real execution) with respect to the shaped cost should result in mimicking the hindsight plan.
1 code implementation • 12 Sep 2022 • Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine
In this work, we introduce a framework for training generalized locomotion (GenLoco) controllers for quadrupedal robots.
2 code implementations • ICLR 2019 • Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.
2 code implementations • 6 Mar 2023 • Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence
Large language models excel at a wide range of complex tasks.
Ranked #2 on Visual Question Answering (VQA) on OK-VQA
1 code implementation • 5 Jun 2022 • Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine
Large language models distill broad knowledge from text corpora.
2 code implementations • ICML 2020 • Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, Yarin Gal
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment, typically leading to arbitrary deductions and poorly-informed decisions.
3 code implementations • 2 Jul 2019 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
2 code implementations • 27 Apr 2020 • Archit Sharma, Michael Ahn, Sergey Levine, Vikash Kumar, Karol Hausman, Shixiang Gu
Can we instead develop efficient reinforcement learning methods that acquire diverse skills without any reward function, and then repurpose these skills for downstream tasks?
1 code implementation • ICLR 2020 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
3 code implementations • 15 Oct 2017 • Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine
One learning signal that is always available for autonomously collected data is prediction: if a robot can learn to predict the future, it can use this predictive model to take actions to produce desired outcomes, such as moving an object to a particular location.
1 code implementation • 6 Feb 2023 • Philip J. Ball, Laura Smith, Ilya Kostrikov, Sergey Levine
Sample efficiency and exploration remain major challenges in online reinforcement learning (RL).
1 code implementation • 10 Jul 2022 • Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
Goal-conditioned policies for robotic navigation can be trained on large, unannotated datasets, providing for good generalization to real-world settings.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
3 code implementations • 4 Apr 2022 • Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment.
1 code implementation • 25 May 2018 • Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine
Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.
8 code implementations • NeurIPS 2020 • Alex X. Lee, Anusha Nagabandi, Pieter Abbeel, Sergey Levine
Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations.
1 code implementation • 13 Feb 2020 • Gregory Kahn, Pieter Abbeel, Sergey Levine
Mobile robot navigation is typically regarded as a geometric problem, in which the robot's objective is to perceive the geometry of the environment in order to plan collision-free paths towards a desired goal.
2 code implementations • 18 Jun 2020 • Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey Levine
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
1 code implementation • 3 Dec 2018 • Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine
Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains.
2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp
In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.
6 code implementations • NeurIPS 2019 • Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine
By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.
3 code implementations • ICLR 2021 • Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf
Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes.
1 code implementation • 8 Dec 2020 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan
In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (RL)
2 code implementations • ICML 2017 • Jacob Andreas, Dan Klein, Sergey Levine
We describe a framework for multitask deep reinforcement learning guided by policy sketches.
2 code implementations • 29 Sep 2017 • Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine
To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based.
1 code implementation • 27 May 2022 • Xinyang Geng, Hao Liu, Lisa Lee, Dale Schuurmans, Sergey Levine, Pieter Abbeel
We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.
2 code implementations • ICCV 2019 • Nicholas Rhinehart, Rowan McAllister, Kris Kitani, Sergey Levine
For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.
3 code implementations • NeurIPS 2020 • Aviral Kumar, Abhishek Gupta, Sergey Levine
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from this corrective feedback, and training on the experience collected by the algorithm is not sufficient to correct errors in the Q-function.
Ranked #3 on Meta-Learning on MT50
8 code implementations • 2 Mar 2016 • Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.
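One of the representations explored here, the normalized advantage function (NAF), restricts Q to a quadratic in the action so that the maximizing action is available in closed form:

```latex
Q(s, a) = V(s) + A(s, a), \qquad
A(s, a) = -\tfrac{1}{2}\, (a - \mu(s))^{\top} P(s)\, (a - \mu(s))
```

With P(s) positive definite, argmax_a Q(s, a) = μ(s), so greedy action selection needs no inner optimization.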
1 code implementation • ICML 2020 • Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike
To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.
3 code implementations • ICLR 2021 • Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.
1 code implementation • 24 Aug 2023 • Homer Walke, Kevin Black, Abraham Lee, Moo Jin Kim, Max Du, Chongyi Zheng, Tony Zhao, Philippe Hansen-Estruch, Quan Vuong, Andre He, Vivek Myers, Kuan Fang, Chelsea Finn, Sergey Levine
By publicly sharing BridgeData V2 and our pre-trained models, we aim to accelerate research in scalable robot learning methods.
1 code implementation • 11 Feb 2019 • Katie Kang, Suneel Belkhale, Gregory Kahn, Pieter Abbeel, Sergey Levine
Deep reinforcement learning provides a promising approach for vision-based control of real-world robots.
1 code implementation • 10 Apr 2023 • Dibya Ghosh, Chethan Bhateja, Sergey Levine
Passive observational data, such as human videos, is abundant and rich in information, yet remains largely untapped by current RL methods.
1 code implementation • 21 Jun 2018 • Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik
The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.
3 code implementations • 17 Feb 2022 • Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine
To address this, we present Design-Bench, a benchmark for offline MBO with a unified evaluation protocol and reference implementations of recent methods.
2 code implementations • 1 Nov 2022 • Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell
The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack.
1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine
In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
2 code implementations • ICLR 2021 • Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine
Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.
Tasks: Multi-Goal Reinforcement Learning, Reinforcement Learning (RL)
1 code implementation • 6 Feb 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.
1 code implementation • ICLR 2020 • Nicholas Rhinehart, Rowan McAllister, Sergey Levine
Yet, reward functions that evoke desirable behavior are often difficult to specify.
1 code implementation • ICLR 2019 • Marvin Zhang, Sharad Vikram, Laura Smith, Pieter Abbeel, Matthew J. Johnson, Sergey Levine
Model-based reinforcement learning (RL) has proven to be a data efficient approach for learning control tasks but is difficult to utilize in domains with complex observations such as images.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1)
1 code implementation • 20 Dec 2021 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
1 code implementation • NeurIPS 2016 • Pulkit Agrawal, Ashvin Nair, Pieter Abbeel, Jitendra Malik, Sergey Levine
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics.
1 code implementation • 12 Jun 2019 • Lisa Lee, Benjamin Eysenbach, Emilio Parisotto, Eric Xing, Sergey Levine, Ruslan Salakhutdinov
The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.
1 code implementation • 24 Jun 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan
There is a growing body of evidence that underfitting on the training data is one of the primary causes of low-quality predictions.
Ranked #6 on Video Generation on BAIR Robot Pushing
1 code implementation • NeurIPS 2011 • Sergey Levine, Zoran Popovic, Vladlen Koltun
We present a probabilistic algorithm for nonlinear inverse reinforcement learning.
1 code implementation • 26 Oct 2020 • Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine
Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.
1 code implementation • ICLR 2018 • Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.
2 code implementations • NeurIPS 2023 • Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine
Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.
1 code implementation • NeurIPS 2023 • Seohong Park, Dibya Ghosh, Benjamin Eysenbach, Sergey Levine
This structure can be very useful, as assessing the quality of actions for nearby goals is typically easier than for more distant goals.
3 code implementations • ICLR 2019 • Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine
On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping.
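The objective behind this unsupervised emergence of skills maximizes the mutual information between states and a latent skill z while keeping actions as random as possible given the state:

```latex
\mathcal{F}(\theta) = I(S; Z) + \mathcal{H}[A \mid S] - I(A; Z \mid S)
```

In practice this is optimized with a variational lower bound: a learned discriminator q(z | s) yields a pseudo-reward of roughly log q(z | s) − log p(z) for the skill-conditioned policy.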
1 code implementation • ICLR 2022 • Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang
Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.
2 code implementations • NAACL 2022 • Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine
Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.
1 code implementation • 29 Feb 2024 • Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar
In this paper, we develop a framework for building multi-turn RL algorithms for fine-tuning LLMs that preserves the flexibility of existing single-turn RL methods for LLMs (e.g., proximal policy optimization), while accommodating multiple turns, long horizons, and delayed rewards effectively.
1 code implementation • 28 Oct 2019 • Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine
This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before.
1 code implementation • 20 Feb 2020 • Sehoon Ha, Peng Xu, Zhenyu Tan, Sergey Levine, Jie Tan
In this paper, we develop a system for learning legged locomotion policies with deep RL in the real world with minimal human effort.
1 code implementation • 30 Nov 2023 • Marwa Abdulhai, Isadora White, Charlie Snell, Charles Sun, Joey Hong, Yuexiang Zhai, Kelvin Xu, Sergey Levine
Developing such algorithms requires tasks that can gauge progress on algorithm design, provide accessible and reproducible evaluations for multi-turn interactions, and cover a range of task properties and challenges in improving reinforcement learning algorithms.
1 code implementation • 23 Feb 2024 • Seohong Park, Tobias Kreiman, Sergey Levine
While a number of methods have been proposed to enable generic self-supervised RL, based on principles such as goal-conditioned RL, behavioral cloning, and unsupervised skill learning, such methods remain limited in terms of either the diversity of the discovered behaviors, the need for high-quality demonstration data, or the lack of a clear prompting or adaptation mechanism for downstream tasks.
1 code implementation • NAACL 2018 • Jacob Andreas, Dan Klein, Sergey Levine
The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world.
1 code implementation • 11 Jul 2017 • YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.
2 code implementations • 14 Jul 2021 • Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine
Computational design problems arise in a number of settings, from synthetic biology to computer architectures.
1 code implementation • NeurIPS 2019 • Soroush Nasiriany, Vitchyr H. Pong, Steven Lin, Sergey Levine
Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors.
2 code implementations • 18 Oct 2021 • Marvin Zhang, Sergey Levine, Chelsea Finn
We study the problem of test-time robustification, i.e., using the test input to improve model robustness.
5 code implementations • ICLR 2021 • John D. Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Sergey Levine, Quoc V. Le, Honglak Lee, Aleksandra Faust
Learning from scratch on simple classical control and gridworld tasks, our method rediscovers the temporal-difference (TD) algorithm.
1 code implementation • 10 Jul 2017 • Rouhollah Rahmatizadeh, Pooya Abolghasemi, Ladislau Bölöni, Sergey Levine
We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation.
1 code implementation • NeurIPS 2020 • Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations.
2 code implementations • 13 Aug 2020 • Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn
That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks in order to adapt to a new task with a very small amount (less than 5 trajectories) of data from the new task.
2 code implementations • NeurIPS 2019 • Pim de Haan, Dinesh Jayaraman, Sergey Levine
Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment.
2 code implementations • NeurIPS 2018 • Abhishek Gupta, Russell Mendonca, Yuxuan Liu, Pieter Abbeel, Sergey Levine
Exploration is a fundamental challenge in reinforcement learning (RL).
1 code implementation • 27 Feb 2024 • Kevin Frans, Seohong Park, Pieter Abbeel, Sergey Levine
Can we pre-train a generalist agent from a large amount of unlabeled offline trajectories such that it can be immediately adapted to any new downstream tasks in a zero-shot manner?
1 code implementation • 9 Oct 2020 • Gregory Kahn, Pieter Abbeel, Sergey Levine
However, we believe that these disengagements not only show where the system fails, which is useful for troubleshooting, but also provide a direct learning signal by which the robot can learn to navigate.
1 code implementation • 13 Oct 2023 • Seohong Park, Oleh Rybkin, Sergey Levine
Through our experiments in five locomotion and manipulation environments, we demonstrate that METRA can discover a variety of useful behaviors even in complex, pixel-based environments, being the first unsupervised RL method that discovers diverse locomotion behaviors in pixel-based Quadruped and Humanoid.
1 code implementation • 27 Oct 2020 • Michael Janner, Igor Mordatch, Sergey Levine
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
1 code implementation • NeurIPS 2020 • Michael Janner, Igor Mordatch, Sergey Levine
We introduce the gamma-model, a predictive model of environment dynamics with an infinite, probabilistic horizon.
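Schematically, the γ-model is a discounted mixture of multi-step transition models: with p^(n) denoting the n-step transition distribution, it predicts states at a geometrically distributed horizon (a sketch; indexing conventions follow the paper only approximately):

```latex
\mu_{\gamma}(\cdot \mid s, a) = (1 - \gamma) \sum_{n=0}^{\infty} \gamma^{\, n}\, p^{(n+1)}(\cdot \mid s, a)
```

At γ = 0 this reduces to a standard single-step model; as γ → 1 it approaches the policy's full discounted state occupancy, mirroring the infinite horizon of value functions.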
1 code implementation • 2 Oct 2018 • Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song
Reinforcement learning algorithms struggle when the reward signal is very sparse.
1 code implementation • 9 Apr 2024 • Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, Alane Suhr
We show that domain-general automatic evaluators can significantly improve the performance of agents for web navigation and device control.
2 code implementations • ICLR 2022 • Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.
1 code implementation • 2 Apr 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.
2 code implementations • 31 Mar 2017 • Alex X. Lee, Sergey Levine, Pieter Abbeel
Our approach is based on servoing the camera in the space of learned visual features, rather than image pixels or manually-designed keypoints.
1 code implementation • 16 Oct 2018 • Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine
We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks.
1 code implementation • ICML 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.
1 code implementation • 27 Oct 2020 • Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine
Reinforcement learning has been applied to a wide variety of robotics problems, but most of such applications involve collecting data from scratch for each new task.
1 code implementation • 21 Apr 2021 • Nicholas Rhinehart, Jeff He, Charles Packer, Matthew A. Wright, Rowan McAllister, Joseph E. Gonzalez, Sergey Levine
Humans have a remarkable ability to make decisions by accurately reasoning about future events, including the future behaviors and states of mind of other agents.
1 code implementation • 24 Jun 2021 • Oleh Rybkin, Chuning Zhu, Anusha Nagabandi, Kostas Daniilidis, Igor Mordatch, Sergey Levine
The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (+1)
3 code implementations • ICLR 2018 • Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine
We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods.
Ranked #5 on Video Prediction on KTH
1 code implementation • 23 Jun 2020 • Oleh Rybkin, Kostas Daniilidis, Sergey Levine
We perform the first comprehensive comparative analysis of calibrated decoders and provide recommendations for simple and effective VAE training.
1 code implementation • 12 Nov 2020 • Karl Schmeckpeper, Oleh Rybkin, Kostas Daniilidis, Sergey Levine, Chelsea Finn
In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans?
1 code implementation • 11 Oct 2022 • Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine
To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.
1 code implementation • 6 Mar 2024 • Benjamin Eysenbach, Vivek Myers, Ruslan Salakhutdinov, Sergey Levine
The key idea is to apply a variant of contrastive learning to time series data.
1 code implementation • NeurIPS 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.
2 code implementations • 8 Feb 2023 • Seohong Park, Sergey Levine
A key component of model-based reinforcement learning (RL) is a dynamics model that predicts the outcomes of actions.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (RL)
1 code implementation • ICLR 2019 • Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths
A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.
2 code implementations • 23 Apr 2020 • Suneel Belkhale, Rachel Li, Gregory Kahn, Rowan McAllister, Roberto Calandra, Sergey Levine
Our experiments demonstrate that our online adaptation approach outperforms non-adaptive methods on a series of challenging suspended payload transportation tasks.
1 code implementation • 24 May 2022 • Siddharth Reddy, Sergey Levine, Anca D. Dragan
How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?
2 code implementations • 27 Sep 2021 • Frederik Ebert, Yanlai Yang, Karl Schmeckpeper, Bernadette Bucher, Georgios Georgakis, Kostas Daniilidis, Chelsea Finn, Sergey Levine
Robot learning holds the promise of learning policies that generalize broadly.
1 code implementation • 6 Oct 2021 • Benjamin Eysenbach, Alexander Khazatsky, Sergey Levine, Ruslan Salakhutdinov
Many model-based reinforcement learning (RL) methods follow a similar template: fit a model to previously observed data, and then use data from that model for RL or planning.
Tasks: Model-based Reinforcement Learning, Reinforcement Learning (RL)
1 code implementation • ICLR 2022 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function.
1 code implementation • 25 Oct 2019 • Abhishek Gupta, Vikash Kumar, Corey Lynch, Sergey Levine, Karol Hausman
We present relay policy learning, a method for imitation and reinforcement learning that can solve multi-stage, long-horizon robotic tasks.
1 code implementation • 14 Aug 2017 • Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine
We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.
1 code implementation • 10 Nov 2020 • Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine
Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.
1 code implementation • ICLR 2021 • Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine
In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot.
1 code implementation • 16 Dec 2022 • Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine
Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.
1 code implementation • NeurIPS 2023 • Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine
Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.
1 code implementation • 26 Feb 2019 • Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine
Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL).
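For readers unfamiliar with the setting being diagnosed, a minimal sketch of Q-learning with a function approximator (here the trivial tabular/one-hot case) follows; the tiny random MDP and hyperparameters are illustrative assumptions.

```python
# Q-learning on a small random MDP; tabular Q is linear in one-hot features.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 5, 2, 0.9, 0.1

# Random MDP: P[s, a] is a distribution over next states; R[s, a] a reward.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.normal(size=(n_states, n_actions))

Q = np.zeros((n_states, n_actions))
s = 0
for step in range(5000):
    # Epsilon-greedy exploration.
    a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
    s_next = rng.choice(n_states, p=P[s, a])
    # Q-learning update: bootstrap from the max over next actions.
    td_target = R[s, a] + gamma * Q[s_next].max()
    Q[s, a] += lr * (td_target - Q[s, a])
    s = s_next

print("greedy policy:", Q.argmax(axis=1))
```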
1 code implementation • ICML 2018 • George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
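The variance reduction at issue is easy to see numerically. Below is a minimal sketch of the score-function gradient estimator grad log pi(a|s) * (R - b(s)) with and without a baseline; the single-state Gaussian policy and reward offset are illustrative assumptions.

```python
# Both estimators below are unbiased; the baseline only changes the variance.
import numpy as np

rng = np.random.default_rng(0)
mu = 0.5                      # policy mean; pi(a) = N(mu, 1)
reward = lambda a: 10.0 + a   # large constant offset inflates variance

def grad_estimates(baseline, n=100_000):
    a = rng.normal(mu, 1.0, size=n)
    score = a - mu                            # d/dmu log N(a; mu, 1)
    return score * (reward(a) - baseline)     # per-sample gradient estimates

print("variance, no baseline:  ", grad_estimates(0.0).var())
print("variance, with baseline:", grad_estimates(10.0).var())  # b ~ E[R]
```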
1 code implementation • 9 Aug 2022 • Marwa Abdulhai, Natasha Jaques, Sergey Levine
IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurate inference of a human's preferences in order to assist them.
1 code implementation • 23 Oct 2019 • Ashvin Nair, Shikhar Bahl, Alexander Khazatsky, Vitchyr Pong, Glen Berseth, Sergey Levine
When the robot's environment and available objects vary, as they do in most open-world settings, the robot must propose to itself only those goals that it can accomplish in its present setting with the objects that are at hand.
1 code implementation • 31 Dec 2019 • Aviral Kumar, Xue Bin Peng, Sergey Levine
By then conditioning the policy on the numerical value of the reward, we can obtain a policy that generalizes to larger returns.
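A minimal sketch of this reward-conditioned supervised-learning recipe follows, assuming a synthetic logged dataset and a simple linear regressor rather than the paper's actual models.

```python
# Treat logged (state, return, action) tuples as supervised data, fit
# action = f(state, target_return), then query f with a high target return.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logged data: actions closer to s earn higher (less negative) return.
S = rng.normal(size=1000)
A = S + rng.normal(scale=0.5, size=1000)   # noisy behavior policy
G = -(A - S) ** 2                          # return earned by each action

# Condition the policy on the numerical value of the return.
X = np.stack([S, G, np.ones_like(S)], axis=1)
w, *_ = np.linalg.lstsq(X, A, rcond=None)

def policy(s, target_return):
    return w @ np.array([s, target_return, 1.0])

# At test time, ask for the best return seen in the data (0.0 here).
print("action for s = 1.0, target return 0.0:", policy(1.0, 0.0))
```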
1 code implementation • ICLR 2018 • Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine
In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt.
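A minimal sketch of the alternation described above, with the environment, policies, and abort threshold all illustrative stubs rather than the paper's learned components:

```python
# Alternate a forward policy with a reset policy; abort a forward episode
# early when the reset policy's value estimate flags a hard-to-undo state.
import numpy as np

rng = np.random.default_rng(0)

def reset_value(s):
    # Stand-in for a learned reset value: states far from 0 are hard to undo.
    return float(np.exp(-abs(s)))

s, aborts = 0.0, 0
for episode in range(5):
    # Forward phase: act until the horizon ends or an early abort triggers.
    for t in range(20):
        s += rng.choice([-1.0, 1.0])      # forward policy (random stub)
        if reset_value(s) < 0.1:          # early abort: state may be irreversible
            aborts += 1
            break
    # Reset phase: the reset policy drives the state back toward the start.
    while abs(s) > 0.5:
        s -= np.sign(s)                   # reset policy (greedy stub)

print("early aborts across 5 episodes:", aborts)
```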
3 code implementations • 6 Oct 2018 • Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn
We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation.
1 code implementation • ICLR 2022 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly: we treat the state value function as a random variable whose randomness is determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then take a state-conditional upper expectile of this random variable to estimate the value of the best actions in that state.
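The key operation, the upper expectile, is simple to write down. Below is a minimal sketch of the asymmetric squared loss whose minimizer approaches an upper expectile of the targets as tau -> 1; the synthetic targets stand in for Q-values of dataset actions and are an illustrative assumption.

```python
import numpy as np

def expectile_loss(pred, target, tau=0.9):
    diff = target - pred
    weight = np.where(diff > 0, tau, 1 - tau)  # penalize under-estimates more
    return (weight * diff ** 2).mean()

rng = np.random.default_rng(0)
targets = rng.normal(size=10_000)  # stand-in Q-values at one state

# Solve for the scalar v minimizing the expectile loss by grid search.
grid = np.linspace(-3, 3, 601)
losses = [expectile_loss(v, targets, tau=0.9) for v in grid]
print("tau = 0.9 expectile:", grid[int(np.argmin(losses))])  # above the mean
```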
2 code implementations • 1 Jun 2021 • Alexander Khazatsky, Ashvin Nair, Daniel Jing, Sergey Levine
In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model, attempt to reach them, and thereby update both its skills and its outcome model.
1 code implementation • 12 Oct 2023 • Max Sobol Mark, Archit Sharma, Fahim Tajwar, Rafael Rafailov, Sergey Levine, Chelsea Finn
Can we leverage offline RL to recover better policies from online interaction?
1 code implementation • 3 Oct 2016 • Chelsea Finn, Sergey Levine
A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback.
Model-based Reinforcement Learning Model Predictive Control +2
1 code implementation • ICML 2020 • Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment.
1 code implementation • 8 Jul 2021 • Vitchyr H. Pong, Ashvin Nair, Laura Smith, Catherine Huang, Sergey Levine
If we can meta-train on offline data, then we can reuse the same static dataset, labeled once with rewards for different tasks, to meta-train policies that adapt to a variety of new tasks at meta-test time.
1 code implementation • 6 Jun 2023 • Chongyi Zheng, Benjamin Eysenbach, Homer Walke, Patrick Yin, Kuan Fang, Ruslan Salakhutdinov, Sergey Levine
Robotic systems that rely primarily on self-supervised learning have the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
1 code implementation • 22 Sep 2017 • Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke
We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN.
1 code implementation • 24 Oct 2021 • Sergey Levine
The recent history of machine learning research has taught us that machine learning methods can be most effective when they are provided with very large, high-capacity models, and trained on very large and diverse datasets.
1 code implementation • NeurIPS 2019 • Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine
In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors.
1 code implementation • NeurIPS 2020 • Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Ruslan Salakhutdinov
In this paper, we show that hindsight relabeling is inverse RL, an observation that suggests that we can use inverse RL in tandem with RL algorithms to efficiently solve many tasks.
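For context, hindsight relabeling itself is a small operation: replace a trajectory's intended goal with states it actually reached, turning failures into labeled successes. The sketch below assumes a toy 2-D trajectory format, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# A trajectory collected while (unsuccessfully) pursuing some original goal.
states = np.cumsum(rng.normal(size=(20, 2)), axis=0)

def relabel(states, k=4):
    """Relabel each transition with future states the trajectory actually reached."""
    data = []
    for t in range(len(states) - 1):
        for f in rng.integers(t + 1, len(states), size=k):
            new_goal = states[f]
            reward = float(np.allclose(states[t + 1], new_goal))
            data.append((states[t], new_goal, reward))
    return data

print(len(relabel(states)), "relabeled transitions from one trajectory")
```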
1 code implementation • NeurIPS 2021 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.
1 code implementation • 22 Sep 2021 • Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.
1 code implementation • NeurIPS 2017 • Justin Fu, John D. Co-Reyes, Sergey Levine
Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes.
1 code implementation • 11 Jun 2019 • Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang
There is ample evidence that humans build a model of the environment, not only by observing it but also by interacting with it.