no code implementations • NeurIPS 2010 • Sergey Levine, Zoran Popovic, Vladlen Koltun
The goal of inverse reinforcement learning is to find a reward function for a Markov decision process, given example traces from its optimal policy.
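For reference, a standard way to make this problem well-posed (the maximum-entropy formulation, which may differ from this paper's exact model) scores whole trajectories by exponentiated cumulative reward,

$$p(\tau \mid r) = \frac{1}{Z}\exp\Big(\sum_t r(s_t, a_t)\Big), \qquad Z = \sum_{\tau'} \exp\Big(\sum_t r(s'_t, a'_t)\Big),$$

and fits $r$ by maximizing the likelihood of the demonstrated traces.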
1 code implementation • NeurIPS 2011 • Sergey Levine, Zoran Popovic, Vladlen Koltun
We present a probabilistic algorithm for nonlinear inverse reinforcement learning.
no code implementations • 7 Nov 2013 • Sergey Levine
In this paper, we explore the application of deep and recurrent neural networks to a continuous, high-dimensional locomotion task, where the network is used to represent a control policy that maps the state of the system (represented by joint angles) directly to the torques at each joint.
no code implementations • NeurIPS 2013 • Sergey Levine, Vladlen Koltun
In order to learn effective control policies for dynamical systems, policy search methods must be able to discover successful executions of the desired task.
no code implementations • NeurIPS 2014 • Sergey Levine, Pieter Abbeel
We present a policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems.
no code implementations • 22 Jan 2015 • Sergey Levine, Nolan Wagener, Pieter Abbeel
Autonomous learning of object manipulation skills can enable robots to acquire rich behavioral repertoires that scale to the variety of objects found in the real world.
Robotics
21 code implementations • 19 Feb 2015 • John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel
We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.
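The procedure's central step can be stated as a KL-constrained surrogate maximization (a standard formulation of the trust-region update; notation assumed):

$$\max_\theta \; \mathbb{E}_{s,a \sim \pi_{\theta_{\text{old}}}}\!\left[\frac{\pi_\theta(a \mid s)}{\pi_{\theta_{\text{old}}}(a \mid s)}\, A^{\pi_{\theta_{\text{old}}}}(s,a)\right] \quad \text{s.t.} \quad \mathbb{E}_s\!\left[D_{\mathrm{KL}}\big(\pi_{\theta_{\text{old}}}(\cdot \mid s)\,\|\,\pi_\theta(\cdot \mid s)\big)\right] \le \delta.$$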
no code implementations • 2 Apr 2015 • Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control.
17 code implementations • 8 Jun 2015 • John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel
Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.
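A minimal sketch of the advantage estimator introduced here (generalized advantage estimation); the array conventions are assumptions:

```python
import numpy as np

def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation (sketch).

    rewards: length-T array; values: length-(T+1) array whose last entry
    is the bootstrap value for the state after the final step.
    """
    T = len(rewards)
    advantages = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        # One-step TD residual, then exponentially weighted accumulation.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages
```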
1 code implementation • 3 Jul 2015 • Bradly C. Stadie, Sergey Levine, Pieter Abbeel
By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces.
Ranked #24 on Atari Games on Atari 2600 Q*Bert
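A hedged sketch of the idea in the entry above: the bonus is the prediction error of a learned forward dynamics model, so novel, poorly modeled states are rewarded; all names here are illustrative:

```python
import torch

def exploration_bonus(dynamics_model, state, action, next_state, scale=1.0):
    # Reward the agent in proportion to how badly the learned model
    # predicts the observed transition (a proxy for novelty).
    with torch.no_grad():
        predicted = dynamics_model(state, action)  # assumed forward model
        error = torch.mean((predicted - next_state) ** 2)
    return scale * error.item()
```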
no code implementations • 5 Jul 2015 • Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel
We evaluate our method on tasks involving continuous control in manipulation and navigation settings, and show that our method can learn complex policies that successfully complete a range of tasks that require memory.
no code implementations • ICCV 2015 • Katerina Fragkiadaki, Sergey Levine, Panna Felsen, Jitendra Malik
We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture.
Ranked #8 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)
1 code implementation • 21 Sep 2015 • Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel
Our method uses a deep spatial autoencoder to acquire a set of feature points that describe the environment for the current task, such as the positions of objects, and then learns a motion skill with these feature points using an efficient reinforcement learning method based on local linear models.
no code implementations • 22 Sep 2015 • Tianhao Zhang, Gregory Kahn, Sergey Levine, Pieter Abbeel
We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment.
no code implementations • 23 Sep 2015 • Justin Fu, Sergey Levine, Pieter Abbeel
One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand.
Model-based Reinforcement Learning, Model Predictive Control, +3
no code implementations • 23 Sep 2015 • Christopher Xie, Sachin Patil, Teodor Moldovan, Sergey Levine, Pieter Abbeel
In this paper, we present a robotic model-based reinforcement learning method that combines ideas from model identification and model predictive control.
Model-based Reinforcement Learning, Model Predictive Control, +2
2 code implementations • 16 Nov 2015 • Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih
Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.
no code implementations • 23 Nov 2015 • Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell
We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains.
no code implementations • 23 Nov 2015 • Katerina Fragkiadaki, Pulkit Agrawal, Sergey Levine, Jitendra Malik
The ability to plan and execute goal specific actions in varied, unexpected settings is a central requirement of intelligent agents.
8 code implementations • NeurIPS 2016 • Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel
We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within.
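A minimal sketch of such a planning module: value iteration expressed as convolution over a 2D state space followed by a max over action channels (shapes and names are illustrative, not the paper's exact architecture):

```python
import torch
import torch.nn.functional as F

def vin_planning(reward_map, transition_kernel, iterations=20):
    # reward_map: (1, 1, H, W); transition_kernel: (A, 2, 3, 3), where the
    # kernel mixes the reward and current-value channels for each action.
    value = torch.zeros_like(reward_map)
    for _ in range(iterations):
        rv = torch.cat([reward_map, value], dim=1)       # (1, 2, H, W)
        q = F.conv2d(rv, transition_kernel, padding=1)   # (1, A, H, W)
        value, _ = torch.max(q, dim=1, keepdim=True)     # Bellman max
    return value
```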
4 code implementations • 1 Mar 2016 • Chelsea Finn, Sergey Levine, Pieter Abbeel
We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems.
8 code implementations • 2 Mar 2016 • Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.
no code implementations • 2 Mar 2016 • Gregory Kahn, Tianhao Zhang, Sergey Levine, Pieter Abbeel
PLATO also maintains the MPC cost as an objective to avoid highly undesirable actions that would result from strictly following the learned policy before it has been fully trained.
no code implementations • 7 Mar 2016 • Sergey Levine, Peter Pastor, Alex Krizhevsky, Deirdre Quillen
We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images.
no code implementations • 21 Mar 2016 • Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel
In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.
1 code implementation • NeurIPS 2016 • Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel
We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.
3 code implementations • NeurIPS 2016 • Chelsea Finn, Ian Goodfellow, Sergey Levine
A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment.
Ranked #26 on Video Generation on BAIR Robot Pushing
1 code implementation • NeurIPS 2016 • Pulkit Agrawal, Ashvin Nair, Pieter Abbeel, Jitendra Malik, Sergey Levine
We investigate an experiential learning paradigm for acquiring an internal model of intuitive physics.
1 code implementation • 15 Jul 2016 • William Montgomery, Sergey Levine
Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.
no code implementations • 22 Sep 2016 • Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine
Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.
1 code implementation • 28 Sep 2016 • Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel
To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function with the goal of satisfying the following property: short horizon planning (as realistic during real executions) with respect to the shaped cost should result in mimicking the hindsight plan.
no code implementations • 28 Sep 2016 • Marvin Zhang, Xinyang Geng, Jonathan Bruce, Ken Caluwaerts, Massimo Vespignani, Vytas SunSpiral, Pieter Abbeel, Sergey Levine
We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities.
no code implementations • 3 Oct 2016 • Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine
We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI2), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization.
no code implementations • 3 Oct 2016 • Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine
In this work, we explore distributed and asynchronous policy learning as a means to achieve generalization and improved training times on challenging, real-world manipulation tasks.
1 code implementation • 3 Oct 2016 • Chelsea Finn, Sergey Levine
A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback.
Model-based Reinforcement Learning, Model Predictive Control, +2
no code implementations • 3 Oct 2016 • Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine
In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.
no code implementations • 4 Oct 2016 • William Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine
Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering.
no code implementations • 5 Oct 2016 • Aravind Rajeswaran, Sarvjeet Ghotra, Balaraman Ravindran, Sergey Levine
Sample complexity and safety are major challenges when learning policies with reinforcement learning for real-world tasks, especially when the policies are represented using rich function approximators like deep neural networks.
2 code implementations • ICML 2017 • Jacob Andreas, Dan Klein, Sergey Levine
We describe a framework for multitask deep reinforcement learning guided by policy sketches.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
3 code implementations • 11 Nov 2016 • Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine
In particular, we demonstrate an equivalence between a sample-based algorithm for maximum entropy IRL and a GAN in which the generator's density can be evaluated and is provided as an additional input to the discriminator.
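Concretely, the discriminator in this equivalence takes the form below, where $q(\tau)$ is the generator's (evaluable) density, $R_\theta$ the learned reward, and $Z$ the partition estimate:

$$D_\theta(\tau) = \frac{\tfrac{1}{Z}\exp(R_\theta(\tau))}{\tfrac{1}{Z}\exp(R_\theta(\tau)) + q(\tau)}.$$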
1 code implementation • 13 Nov 2016 • Fereshteh Sadeghi, Sergey Levine
We propose a learning method that we call CAD²RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models.
no code implementations • 15 Nov 2016 • Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine
We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.
no code implementations • NeurIPS 2016 • William H. Montgomery, Sergey Levine
Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.
no code implementations • 1 Dec 2016 • Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine
We evaluate our method on challenging tasks that require control directly from images, and show that our approach can improve the generalization of a learned deep neural network policy by using experience for which no reward function is available.
no code implementations • 20 Dec 2016 • Pierre Sermanet, Kelvin Xu, Sergey Levine
We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps.
no code implementations • 3 Feb 2017 • Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine
However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot.
6 code implementations • CVPR 2017 • Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik
The accumulated belief of the world enables the agent to track visited regions of the environment.
3 code implementations • ICML 2017 • Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine
We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before.
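The energy-based policy in question assigns action probabilities through a soft Q-function (standard maximum-entropy RL notation, with temperature $\alpha$):

$$\pi(a \mid s) \propto \exp\!\Big(\tfrac{1}{\alpha} Q_{\text{soft}}(s,a)\Big), \qquad V_{\text{soft}}(s) = \alpha \log \int_{\mathcal{A}} \exp\!\Big(\tfrac{1}{\alpha} Q_{\text{soft}}(s,a')\Big)\, da'.$$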
1 code implementation • NeurIPS 2017 • Justin Fu, John D. Co-Reyes, Sergey Levine
Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes.
no code implementations • 6 Mar 2017 • Ashvin Nair, Dian Chen, Pulkit Agrawal, Phillip Isola, Pieter Abbeel, Jitendra Malik, Sergey Levine
Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics.
no code implementations • 8 Mar 2017 • Abhishek Gupta, Coline Devin, Yuxuan Liu, Pieter Abbeel, Sergey Levine
People can learn a wide range of tasks from their own experience, but can also learn from observing other creatures.
82 code implementations • ICML 2017 • Chelsea Finn, Pieter Abbeel, Sergey Levine
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.
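A minimal PyTorch sketch of the resulting gradient-through-gradient update with a single inner step; `loss_fn` and the task batches are assumed, and `torch.func.functional_call` requires a recent PyTorch:

```python
import torch
from torch.func import functional_call

def maml_step(model, tasks, meta_opt, loss_fn, inner_lr=0.01):
    """One meta-update over a batch of (support, query) tasks (sketch)."""
    meta_opt.zero_grad()
    params = dict(model.named_parameters())
    for (x_s, y_s), (x_q, y_q) in tasks:
        # Inner loop: adapt parameters on the support set.
        loss = loss_fn(functional_call(model, params, (x_s,)), y_s)
        grads = torch.autograd.grad(loss, list(params.values()),
                                    create_graph=True)
        adapted = {n: p - inner_lr * g
                   for (n, p), g in zip(params.items(), grads)}
        # Outer loop: the query loss backpropagates through the adaptation.
        loss_fn(functional_call(model, adapted, (x_q,)), y_q).backward()
    meta_opt.step()
```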
no code implementations • 27 Mar 2017 • Somil Bansal, Roberto Calandra, Ted Xiao, Sergey Levine, Claire J. Tomlin
Real-world robots are becoming increasingly complex and commonly act in poorly understood environments where it is extremely challenging to model or learn their true dynamics.
2 code implementations • 31 Mar 2017 • Alex X. Lee, Sergey Levine, Pieter Abbeel
Our approach is based on servoing the camera in the space of learned visual features, rather than image pixels or manually-designed keypoints.
7 code implementations • 23 Apr 2017 • Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine
While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.
Ranked #3 on Video Alignment on UPenn Action
no code implementations • NeurIPS 2017 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques.
no code implementations • 6 Jul 2017 • Eric Jang, Sudheendra Vijayanarasimhan, Peter Pastor, Julian Ibarz, Sergey Levine
We consider the task of semantic robotic grasping, in which a robot picks up an object of a user-specified class using only monocular images.
1 code implementation • 10 Jul 2017 • Rouhollah Rahmatizadeh, Pooya Abolghasemi, Ladislau Bölöni, Sergey Levine
We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation.
1 code implementation • 11 Jul 2017 • YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.
no code implementations • ICCV 2017 • Avi Singh, Larry Yang, Sergey Levine
We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose.
8 code implementations • 8 Aug 2017 • Anusha Nagabandi, Gregory Kahn, Ronald S. Fearing, Sergey Levine
Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.
Model-based Reinforcement Learning, Model Predictive Control, +2
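A hedged sketch of the model-predictive control loop such methods wrap around a learned dynamics model (a simple random-shooting planner; every name here is illustrative):

```python
import numpy as np

def mpc_action(dynamics, reward_fn, state, action_dim,
               horizon=10, n_candidates=1000):
    # Sample candidate action sequences, score them by rolling out the
    # learned model, execute the first action of the best one, replan.
    actions = np.random.uniform(-1, 1, (n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    states = np.repeat(state[None], n_candidates, axis=0)
    for t in range(horizon):
        returns += reward_fn(states, actions[:, t])
        states = dynamics(states, actions[:, t])  # learned f(s, a) -> s'
    return actions[np.argmax(returns), 0]
```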
1 code implementation • 14 Aug 2017 • Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine
We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.
no code implementations • 10 Sep 2017 • Somil Bansal, Roberto Calandra, Kurtland Chua, Sergey Levine, Claire Tomlin
Reinforcement Learning is divided in two main paradigms: model-free and model-based.
3 code implementations • 14 Sep 2017 • Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine
In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration.
1 code implementation • 22 Sep 2017 • Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke
We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN.
1 code implementation • 28 Sep 2017 • Aravind Rajeswaran, Vikash Kumar, Abhishek Gupta, Giulia Vezzani, John Schulman, Emanuel Todorov, Sergey Levine
Furthermore, deployment of DRL on physical systems remains challenging due to sample inefficiency.
2 code implementations • 29 Sep 2017 • Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine
To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based.
3 code implementations • 15 Oct 2017 • Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine
One learning signal that is always available for autonomously collected data is prediction: if a robot can learn to predict the future, it can use this predictive model to take actions to produce desired outcomes, such as moving an object to a particular location.
1 code implementation • 16 Oct 2017 • Roberto Calandra, Andrew Owens, Manu Upadhyaya, Wenzhen Yuan, Justin Lin, Edward H. Adelson, Sergey Levine
In this work, we investigate the question of whether touch sensing aids in predicting grasp outcomes within a multimodal sensing framework that combines vision and touch.
3 code implementations • ICLR 2018 • Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine
We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods.
Ranked #5 on Video Prediction on KTH
7 code implementations • 30 Oct 2017 • Justin Fu, Katie Luo, Sergey Levine
Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.
Ranked #3 on MuJoCo Games on Ant
1 code implementation • ICML 2018 • Peter Jin, Kurt Keutzer, Sergey Levine
Deep reinforcement learning algorithms that estimate state and state-action value functions have been shown to be effective in a variety of challenging domains, including learning control strategies from raw image pixels.
no code implementations • ICLR 2018 • Chelsea Finn, Sergey Levine
Learning to learn is a powerful paradigm for enabling models to learn from data more effectively and efficiently.
1 code implementation • NAACL 2018 • Jacob Andreas, Dan Klein, Sergey Levine
The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world.
no code implementations • 14 Nov 2017 • Anusha Nagabandi, Guangzhao Yang, Thomas Asmar, Ravi Pandya, Gregory Kahn, Sergey Levine, Ronald S. Fearing
We present an approach for controlling a real-world legged millirobot that is based on learned neural network models.
1 code implementation • ICLR 2018 • Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine
In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt.
1 code implementation • ICLR 2018 • Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.
no code implementations • 20 Dec 2017 • Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine
To this end, we train a deep recurrent controller that can automatically determine which actions move the end-point of a robotic arm to a desired object.
no code implementations • 21 Dec 2017 • Saurabh Gupta, David Fouhey, Sergey Levine, Jitendra Malik
This work presents a formulation for visual navigation that unifies map based spatial reasoning and path planning, with landmark based robust plan execution in noisy environments.
no code implementations • ICLR 2018 • Alex X. Lee, Frederik Ebert, Richard Zhang, Chelsea Finn, Pieter Abbeel, Sergey Levine
In this paper, we study the problem of multi-step video prediction, where the goal is to predict a sequence of future frames conditioned on a short context.
no code implementations • ICLR 2018 • Justin Fu, Katie Luo, Sergey Levine
Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.
no code implementations • ICLR 2018 • Tong Che, Yuchen Lu, George Tucker, Surya Bhupatiraju, Shane Gu, Sergey Levine, Yoshua Bengio
Model-free deep reinforcement learning algorithms are able to successfully solve a wide range of continuous control tasks, but typically require many on-policy samples to achieve good performance.
76 code implementations • ICML 2018 • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
A platform for Applied Reinforcement Learning (Applied RL)
Ranked #1 on Continuous Control on Lunar Lander (OpenAI Gym)
no code implementations • ICLR 2018 • Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths
Meta-learning allows an intelligent agent to leverage prior learning episodes as a basis for quickly improving performance on a novel task.
2 code implementations • 5 Feb 2018 • Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, Sergey Levine
Humans and animals are capable of learning a new behavior by observing others perform the skill just once.
1 code implementation • 6 Feb 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.
no code implementations • ICLR 2018 • Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell
We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.
3 code implementations • ICLR 2019 • Benjamin Eysenbach, Abhishek Gupta, Julian Ibarz, Sergey Levine
On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping.
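The objective driving this emergence maximizes mutual information between a latent skill $z$ and the states it visits, which can be optimized with a pseudo-reward of the form

$$r(s, z) = \log q_\phi(z \mid s) - \log p(z),$$

where $q_\phi$ is a learned skill discriminator and $p(z)$ the skill prior.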
2 code implementations • NeurIPS 2018 • Abhishek Gupta, Russell Mendonca, Yuxuan Liu, Pieter Abbeel, Sergey Levine
Exploration is a fundamental challenge in reinforcement learning (RL).
no code implementations • ICLR 2018 • Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine
TDMs combine the benefits of model-free and model-based RL: they leverage the rich information in state transitions to learn very efficiently, while still attaining asymptotic performance that exceeds that of direct model-based RL methods.
1 code implementation • ICML 2018 • George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
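For reference, the estimator in question is the standard score-function gradient, which stays unbiased for any state-dependent baseline $b(s)$:

$$\nabla_\theta J(\theta) = \mathbb{E}_{s,a \sim \pi_\theta}\big[\nabla_\theta \log \pi_\theta(a \mid s)\,\big(Q^{\pi}(s,a) - b(s)\big)\big].$$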
no code implementations • 28 Feb 2018 • Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine
By enabling wider use of learned dynamics models within a model-free reinforcement learning algorithm, we improve value estimation, which, in turn, reduces the sample complexity of learning.
1 code implementation • 28 Feb 2018 • Deirdre Quillen, Eric Jang, Ofir Nachum, Chelsea Finn, Julian Ibarz, Sergey Levine
In this paper, we explore deep reinforcement learning algorithms for vision-based robotic grasping.
no code implementations • 1 Mar 2018 • Brian Yang, Grant Wang, Roberto Calandra, Daniel Contreras, Sergey Levine, Kristofer Pister
This approach formalizes locomotion as a contextual policy search task to collect data, and subsequently uses that data to learn multi-objective locomotion primitives that can be used for planning.
1 code implementation • 19 Mar 2018 • Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine
Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.
2 code implementations • ICLR 2019 • Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.
no code implementations • 31 Mar 2018 • Łukasz Kidziński, Sharada P. Mohanty, Carmichael Ong, Jennifer L. Hicks, Sean F. Carroll, Sergey Levine, Marcel Salathé, Scott L. Delp
Synthesizing physiologically-accurate human movement in a variety of conditions can help practitioners plan surgeries, design experiments, or prototype assistive devices in simulated environments, reducing time and costs and improving treatment outcomes.
2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp
In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.
no code implementations • ICLR 2019 • Anirudh Goyal, Philemon Brakel, William Fedus, Soumye Singhal, Timothy Lillicrap, Sergey Levine, Hugo Larochelle, Yoshua Bengio
In many environments only a tiny subset of all states yield high reward.
1 code implementation • 2 Apr 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.
4 code implementations • ICLR 2019 • Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine
However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.
Ranked #1 on Video Prediction on KTH (Cond metric)
6 code implementations • 8 Apr 2018 • Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne
We further explore a number of methods for integrating multiple clips into the learning process to develop multi-skilled agents capable of performing a rich repertoire of diverse skills.
no code implementations • ICML 2018 • Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine
In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.
Hierarchical Reinforcement Learning, reinforcement-learning, +1
2 code implementations • 2 May 2018 • Sergey Levine
The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable.
12 code implementations • NeurIPS 2018 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.
Hierarchical Reinforcement Learning, reinforcement-learning, +1
1 code implementation • NeurIPS 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.
1 code implementation • 25 May 2018 • Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine
Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.
no code implementations • 28 May 2018 • Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward H. Adelson, Sergey Levine
This model -- a deep, multimodal convolutional network -- predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions.
no code implementations • NeurIPS 2018 • Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine
We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.
10 code implementations • NeurIPS 2018 • Kurtland Chua, Roberto Calandra, Rowan Mcallister, Sergey Levine
Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance.
Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 31 May 2018 • Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn
A significant challenge for the practical application of reinforcement learning in the real world is the need to specify an oracle reward function that correctly defines a task.
no code implementations • CVPR 2018 • Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine
In robotics, this ability is referred to as visual servoing: moving a tool or end-point to a desired location using primarily visual feedback.
1 code implementation • NeurIPS 2018 • Chelsea Finn, Kelvin Xu, Sergey Levine
However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e. g., a classifier) for that task that is accurate.
no code implementations • ICML 2018 • John D. Co-Reyes, Yuxuan Liu, Abhishek Gupta, Benjamin Eysenbach, Pieter Abbeel, Sergey Levine
We show that we can learn continuous latent representations of trajectories, which are effective in solving temporally extended and multi-stage problems.
Hierarchical Reinforcement Learning, reinforcement-learning, +2
no code implementations • ICLR 2020 • Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine
In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by utilizing experience from prior tasks.
1 code implementation • 21 Jun 2018 • Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik
The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.
1 code implementation • 27 Jun 2018 • Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine
In this paper, we study the problem of learning vision-based dynamic manipulation skills using a scalable reinforcement learning approach.
1 code implementation • ICML 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.
2 code implementations • NeurIPS 2018 • Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine
For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.
1 code implementation • ICLR 2019 • Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths
A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.
no code implementations • ICLR 2019 • Dinesh Jayaraman, Frederik Ebert, Alexei A. Efros, Sergey Levine
Prediction is arguably one of the most basic functions of an intelligent system.
1 code implementation • ICLR 2019 • Marvin Zhang, Sharad Vikram, Laura Smith, Pieter Abbeel, Matthew J. Johnson, Sergey Levine
Model-based reinforcement learning (RL) has proven to be a data efficient approach for learning control tasks but is difficult to utilize in domains with complex observations such as images.
Model-based Reinforcement Learning, reinforcement-learning, +1
3 code implementations • ICLR 2019 • Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson
We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.
no code implementations • 27 Sep 2018 • Kurtland Chua, Rowan Mcallister, Roberto Calandra, Sergey Levine
We show that both challenges can be addressed by representing model-uncertainty, which can both guide exploration in the unsupervised phase and ensure that the errors in the model are not exploited by the planner in the goal-directed phase.
Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 27 Sep 2018 • HyoungSeok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song
Policy optimization struggles when the reward feedback signal is very sparse and essentially becomes a random search algorithm until the agent stumbles upon a rewarding state or the goal state.
no code implementations • 27 Sep 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Learning to imitate expert actions given demonstrations containing image observations is a difficult problem in robotic control.
no code implementations • 30 Sep 2018 • Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn
To that end, we formulate the few-shot objective learning problem, where the goal is to learn a task objective from only a few example images of successful end states for that task.
5 code implementations • ICLR 2019 • Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine
By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.
no code implementations • 2 Oct 2018 • Suraj Nair, Mohammad Babaeizadeh, Chelsea Finn, Sergey Levine, Vikash Kumar
We test our method on the domain of assembly, specifically the mating of Tetris-style block pairs.
1 code implementation • 2 Oct 2018 • Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song
Reinforcement learning algorithms struggle when the reward signal is very sparse.
7 code implementations • ICLR 2019 • Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine
We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning.
no code implementations • ICLR 2019 • Kyle Hsu, Sergey Levine, Chelsea Finn
A central goal of unsupervised learning is to acquire representations from unlabeled data or experience that can be used for more effective learning of downstream tasks from modest amounts of labeled data.
3 code implementations • 6 Oct 2018 • Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn
We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation.
1 code implementation • 8 Oct 2018 • Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine
In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV).
no code implementations • 14 Oct 2018 • Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar
Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.
1 code implementation • ICLR 2020 • Nicholas Rhinehart, Rowan Mcallister, Sergey Levine
Yet, reward functions that evoke desirable behavior are often difficult to specify.
1 code implementation • 16 Oct 2018 • Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine
We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks.
no code implementations • 25 Oct 2018 • Tianhe Yu, Pieter Abbeel, Sergey Levine, Chelsea Finn
We consider the problem of learning multi-stage vision-based tasks on a real robot from a single video of a human performing the task, while leveraging demonstration data of subtasks with other objects.
1 code implementation • 16 Nov 2018 • Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine
We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin.
1 code implementation • 19 Nov 2018 • Dibya Ghosh, Abhishek Gupta, Sergey Levine
Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.
1 code implementation • ICLR 2019 • John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine
However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.
1 code implementation • 3 Dec 2018 • Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine
Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains.
no code implementations • NeurIPS 2018 • Ashish Kumar, Saurabh Gupta, David Fouhey, Sergey Levine, Jitendra Malik
Equipped with this abstraction, a second network observes the world and decides how to act to retrace the path under noisy actuation and a changing environment.
no code implementations • 7 Dec 2018 • Tobias Johannink, Shikhar Bahl, Ashvin Nair, Jianlan Luo, Avinash Kumar, Matthias Loskyll, Juan Aparicio Ojea, Eugen Solowjow, Sergey Levine
In this paper, we study how we can solve difficult control problems in the real world by decomposing them into a part that is solved efficiently by conventional feedback control methods, and the residual which is solved with RL.
50 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
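This entry is the soft actor-critic line of work, whose maximum-entropy objective augments the return with a policy-entropy bonus (standard notation, with temperature $\alpha$):

$$J(\pi) = \sum_t \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\big[\, r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \,\big].$$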
no code implementations • CVPR 2019 • Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, Konstantinos Bousmalis
Using domain adaptation methods to cross this "reality gap" requires a large amount of unlabelled real-world data, whilst domain randomization alone can waste modeling power.
no code implementations • ICLR 2019 • Anusha Nagabandi, Chelsea Finn, Sergey Levine
The goal in this paper is to develop a method for continual online learning from an incoming stream of data, using deep neural network models.
no code implementations • 26 Dec 2018 • Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine
In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.
no code implementations • 27 Dec 2018 • Rowan McAllister, Gregory Kahn, Jeff Clune, Sergey Levine
Our method estimates an uncertainty measure about the model's prediction, taking into account an explicit (generative) model of the observation distribution to handle out-of-distribution inputs.
no code implementations • 28 Dec 2018 • Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
Object-based factorizations provide a useful level of abstraction for interacting with the world.
no code implementations • 11 Jan 2019 • Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S. J. Pister
Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times.
Model-based Reinforcement Learning, reinforcement-learning, +1
no code implementations • 30 Jan 2019 • Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine
In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
1 code implementation • 7 Feb 2019 • Łukasz Kidziński, Carmichael Ong, Sharada Prasanna Mohanty, Jennifer Hicks, Sean F. Carroll, Bo Zhou, Hongsheng Zeng, Fan Wang, Rongzhong Lian, Hao Tian, Wojciech Jaśkowski, Garrett Andersen, Odd Rune Lykkebø, Nihat Engin Toklu, Pranav Shyam, Rupesh Kumar Srivastava, Sergey Kolesnikov, Oleksii Hrinchuk, Anton Pechenko, Mattias Ljungström, Zhen Wang, Xu Hu, Zehong Hu, Minghui Qiu, Jun Huang, Aleksei Shpilman, Ivan Sosin, Oleg Svidchenko, Aleksandra Malysheva, Daniel Kudenko, Lance Rane, Aditya Bhatt, Zhengfei Wang, Penghui Qi, Zeyang Yu, Peng Peng, Quan Yuan, Wenxin Li, Yunsheng Tian, Ruihan Yang, Pingchuan Ma, Shauharda Khadka, Somdeb Majumdar, Zach Dwiel, Yinyin Liu, Evren Tumer, Jeremy Watson, Marcel Salathé, Sergey Levine, Scott Delp
In the NeurIPS 2018 Artificial Intelligence for Prosthetics challenge, participants were tasked with building a controller for a musculoskeletal model with a goal of matching a given time-varying velocity vector.
1 code implementation • 11 Feb 2019 • Katie Kang, Suneel Belkhale, Gregory Kahn, Pieter Abbeel, Sergey Levine
Deep reinforcement learning provides a promising approach for vision-based control of real-world robots.
no code implementations • ICLR 2019 • Justin Fu, Anoop Korattikara, Sergey Levine, Sergio Guadarrama
In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments.
no code implementations • ICLR Workshop LLD 2019 • Chelsea Finn, Aravind Rajeswaran, Sham Kakade, Sergey Levine
Meta-learning views this problem as learning a prior over model parameters that is amenable for fast adaptation on a new task, but typically assumes the set of tasks are available together as a batch.
1 code implementation • 26 Feb 2019 • Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine
Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL).
2 code implementations • 1 Mar 2019 • Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.
Ranked #12 on Atari Games 100k on Atari 100k
1 code implementation • ICLR 2020 • Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma
Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.
Ranked #15 on Video Generation on BAIR Robot Pushing
1 code implementation • 5 Mar 2019 • Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet
Learning from play (LfP) offers three main advantages: 1) It is cheap.
Robotics
2 code implementations • ICML 2020 • Vitchyr H. Pong, Murtaza Dalal, Steven Lin, Ashvin Nair, Shikhar Bahl, Sergey Levine
Autonomous agents that must exhibit flexible and broad capabilities will need to be equipped with large repertoires of skills.
no code implementations • 8 Mar 2019 • Justin Lin, Roberto Calandra, Sergey Levine
We propose a novel framing of the problem as multi-modal recognition: the goal of our system is to recognize, given a visual and tactile observation, whether or not these observations correspond to the same object.
no code implementations • 11 Mar 2019 • Stephen Tian, Frederik Ebert, Dinesh Jayaraman, Mayur Mudigonda, Chelsea Finn, Roberto Calandra, Sergey Levine
Touch sensing is widely acknowledged to be important for dexterous robotic manipulation, but exploiting tactile sensing for continuous, non-prehensile manipulation is challenging.
7 code implementations • ICLR Workshop LLD 2019 • Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
no code implementations • NeurIPS 2019 • Sherjil Ozair, Corey Lynch, Yoshua Bengio, Aaron van den Oord, Sergey Levine, Pierre Sermanet
Mutual information maximization has emerged as a powerful learning objective for unsupervised representation learning obtaining state-of-the-art performance in applications such as object recognition, speech recognition, and reinforcement learning.
no code implementations • NeurIPS 2019 • Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn
Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.
no code implementations • 11 Apr 2019 • Annie Xie, Frederik Ebert, Sergey Levine, Chelsea Finn
Machine learning techniques have enabled robots to learn narrow, yet complex tasks and also perform broad, yet simple skills with a wide variety of objects.
3 code implementations • 16 Apr 2019 • Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.
no code implementations • ICLR 2019 • Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine
In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.
no code implementations • ICLR 2019 • Dibya Ghosh, Abhishek Gupta, Sergey Levine
Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.
no code implementations • ICLR 2019 • Kate Rakelly*, Evan Shelhamer*, Trevor Darrell, Alexei A. Efros, Sergey Levine
To explore generalization, we analyze guidance as a bridge between different levels of supervision to segment classes as the union of instances.
no code implementations • ICLR 2019 • Rosen Kralev, Russell Mendonca, Alvin Zhang, Tianhe Yu, Abhishek Gupta, Pieter Abbeel, Sergey Levine, Chelsea Finn
Meta-reinforcement learning aims to learn fast reinforcement learning (RL) procedures that can be applied to new tasks or environments.
no code implementations • ICLR 2019 • Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn
A significant challenge for the practical application of reinforcement learning to real-world problems is the need to specify an oracle reward function that correctly defines a task.
no code implementations • ICLR 2019 • Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
Object-based factorizations provide a useful level of abstraction for interacting with the world.
2 code implementations • ICCV 2019 • Nicholas Rhinehart, Rowan Mcallister, Kris Kitani, Sergey Levine
For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.
1 code implementation • 3 May 2019 • Thomas Liao, Grant Wang, Brian Yang, Rene Lee, Kristofer Pister, Sergey Levine, Roberto Calandra
Robot design is often a slow and difficult process requiring the iterative construction and testing of prototypes, with the goal of sequentially optimizing the design.
no code implementations • 17 May 2019 • Brian Yang, Jesse Zhang, Vitchyr Pong, Sergey Levine, Dinesh Jayaraman
We envision REPLAB as a framework for reproducible research across manipulation tasks, and as a step in this direction, we define a template for a grasping benchmark consisting of a task definition, evaluation protocol, performance measures, and a dataset of 92k grasp attempts.
1 code implementation • NeurIPS 2019 • Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine
In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors.
2 code implementations • ICLR 2020 • Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.
5 code implementations • ICLR 2020 • Siddharth Reddy, Anca D. Dragan, Sergey Levine
Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.
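A hedged sketch of the relabeling at the core of SQIL: demonstration transitions get reward +1, the agent's own experience gets reward 0, and any off-policy RL algorithm then trains on the mixed batch (the dict-style buffer API is an assumption):

```python
import numpy as np

def sqil_batch(demo_buffer, agent_buffer, batch_size=256):
    # Sample half the batch from demonstrations, half from agent experience.
    demo = demo_buffer.sample(batch_size // 2)    # assumed: dict of arrays
    agent = agent_buffer.sample(batch_size // 2)
    demo["reward"] = np.ones(batch_size // 2)     # expert data: r = +1
    agent["reward"] = np.zeros(batch_size // 2)   # agent data:  r = 0
    return {k: np.concatenate([demo[k], agent[k]]) for k in demo}
```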
2 code implementations • NeurIPS 2019 • Pim de Haan, Dinesh Jayaraman, Sergey Levine
Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment.
no code implementations • 31 May 2019 • Brijen Thananjeyan, Ashwin Balakrishna, Ugo Rosolia, Felix Li, Rowan Mcallister, Joseph E. Gonzalez, Sergey Levine, Francesco Borrelli, Ken Goldberg
Reinforcement learning (RL) for robotics is challenging due to the difficulty in hand-engineering a dense cost function, which can lead to unintended behavior, and dynamical uncertainty, which makes exploration and constraint satisfaction challenging.
Model-based Reinforcement Learning, reinforcement-learning, +1
3 code implementations • NeurIPS 2019 • Aviral Kumar, Justin Fu, George Tucker, Sergey Levine
Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.
no code implementations • NeurIPS 2019 • Alex Irpan, Kanishka Rao, Konstantinos Bousmalis, Chris Harris, Julian Ibarz, Sergey Levine
However, for high-dimensional observations, such as images, models of the environment can be difficult to fit and value-based methods can make IS hard to use or even ill-conditioned, especially when dealing with continuous action spaces.
no code implementations • 7 Jun 2019 • Allan Zhou, Eric Jang, Daniel Kappler, Alex Herzog, Mohi Khansari, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Sergey Levine, Chelsea Finn
Imitation learning allows agents to learn complex behaviors from demonstrations.
1 code implementation • 11 Jun 2019 • Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang
There is enough evidence that humans build a model of the environment, not only by observing the environment but also by interacting with the environment.
1 code implementation • 12 Jun 2019 • Lisa Lee, Benjamin Eysenbach, Emilio Parisotto, Eric Xing, Sergey Levine, Ruslan Salakhutdinov
The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.
1 code implementation • NeurIPS 2019 • Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine
We introduce a general control algorithm that combines the strengths of planning and reinforcement learning to effectively solve these tasks.
1 code implementation • 13 Jun 2019 • Gerrit Schoettler, Ashvin Nair, Jianlan Luo, Shikhar Bahl, Juan Aparicio Ojea, Eugen Solowjow, Sergey Levine
Connector insertion and many other tasks commonly found in modern manufacturing settings involve complex contact dynamics and friction.
11 code implementations • NeurIPS 2019 • Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.
Model-based Reinforcement Learning, reinforcement-learning, +1
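A hedged sketch of the short, branched model rollouts this trade-off suggests: start from real states, roll the learned model a few steps under the current policy, and add the imagined transitions to the policy's replay buffer (names and the buffer API are illustrative):

```python
def model_rollouts(model, policy, real_states, model_buffer, k=5):
    # Short rollouts keep model-generated data useful before model bias
    # compounds; k controls the bias/data-generation trade-off.
    states = real_states
    for _ in range(k):
        actions = policy(states)
        next_states, rewards = model(states, actions)  # learned dynamics
        model_buffer.add(states, actions, rewards, next_states)
        states = next_states
```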
no code implementations • ICLR 2020 • Anirudh Goyal, Shagun Sodhani, Jonathan Binas, Xue Bin Peng, Sergey Levine, Yoshua Bengio
Reinforcement learning agents that operate in diverse and complex environments can benefit from the structured decomposition of their behavior.
Hierarchical Reinforcement Learning, reinforcement-learning, +1
8 code implementations • NeurIPS 2020 • Alex X. Lee, Anusha Nagabandi, Pieter Abbeel, Sergey Levine
Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations.
3 code implementations • 2 Jul 2019 • Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman
Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.
no code implementations • ICLR 2020 • Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine
We show that dynamical distances can be used in a semi-supervised regime, where unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision is used to determine the task goal, without any manually engineered reward function or goal examples.
6 code implementations • NeurIPS 2019 • Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine
By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.
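The key identity is the implicit Jacobian of the regularized inner solution $\phi^\ast$ with respect to the meta-parameters (a standard statement of the implicit-function result; $\lambda$ is the strength of the proximal inner regularizer):

$$\frac{d\phi^{\ast}}{d\theta} = \Big(I + \tfrac{1}{\lambda}\,\nabla^2_{\phi}\hat{\mathcal{L}}(\phi^{\ast})\Big)^{-1}.$$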
no code implementations • 22 Sep 2019 • Gokul Swamy, Siddharth Reddy, Sergey Levine, Anca D. Dragan
We learn a model of the user's preferences from observations of the user's choices in easy settings with a few robots, and use it in challenging settings with more robots to automatically identify which robot the user would most likely choose to control, if they were able to evaluate the states of all robots at all times.
no code implementations • 23 Sep 2019 • Ofir Nachum, Haoran Tang, Xingyu Lu, Shixiang Gu, Honglak Lee, Sergey Levine
Hierarchical reinforcement learning has demonstrated significant success at solving difficult reinforcement learning (RL) tasks.
Hierarchical Reinforcement Learning, reinforcement-learning, +1
3 code implementations • ICLR 2021 • Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf
Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes.
no code implementations • 25 Sep 2019 • Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine
Reinforcement learning algorithms can acquire policies for complex tasks automatically; however, the number of samples required to learn a diverse set of skills can be prohibitively large.