Search Results for author: Sergey Levine

Found 491 papers, 228 papers with code

Feature Construction for Inverse Reinforcement Learning

no code implementations NeurIPS 2010 Sergey Levine, Zoran Popovic, Vladlen Koltun

The goal of inverse reinforcement learning is to find a reward function for a Markov decision process, given example traces from its optimal policy.

reinforcement-learning Reinforcement Learning (RL)

Exploring Deep and Recurrent Architectures for Optimal Control

no code implementations 7 Nov 2013 Sergey Levine

In this paper, we explore the application of deep and recurrent neural networks to a continuous, high-dimensional locomotion task, where the network is used to represent a control policy that maps the state of the system (represented by joint angles) directly to the torques at each joint.

Variational Policy Search via Trajectory Optimization

no code implementations NeurIPS 2013 Sergey Levine, Vladlen Koltun

In order to learn effective control policies for dynamical systems, policy search methods must be able to discover successful executions of the desired task.

Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics

no code implementations NeurIPS 2014 Sergey Levine, Pieter Abbeel

We present a policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems.

Learning Contact-Rich Manipulation Skills with Guided Policy Search

no code implementations 22 Jan 2015 Sergey Levine, Nolan Wagener, Pieter Abbeel

Autonomous learning of object manipulation skills can enable robots to acquire rich behavioral repertoires that scale to the variety of objects found in the real world.

Robotics

Trust Region Policy Optimization

21 code implementations 19 Feb 2015 John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.

Atari Games Policy Gradient Methods
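The "guaranteed monotonic improvement" in the abstract refers to maximizing a surrogate objective inside a KL-divergence trust region. As a reminder of the method's core idea (restated here, not quoted from this listing), the TRPO update can be written as:

```latex
\max_{\theta} \;\;
\mathbb{E}_{s,a \sim \pi_{\theta_{\mathrm{old}}}}
\left[
  \frac{\pi_{\theta}(a \mid s)}{\pi_{\theta_{\mathrm{old}}}(a \mid s)}
  \, A^{\pi_{\theta_{\mathrm{old}}}}(s, a)
\right]
\quad \text{subject to} \quad
\mathbb{E}_{s}\!\left[
  D_{\mathrm{KL}}\!\big( \pi_{\theta_{\mathrm{old}}}(\cdot \mid s) \,\|\, \pi_{\theta}(\cdot \mid s) \big)
\right] \le \delta
```

where $A^{\pi_{\theta_{\mathrm{old}}}}$ is the advantage under the old policy and $\delta$ bounds the step size in policy space.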

End-to-End Training of Deep Visuomotor Policies

no code implementations 2 Apr 2015 Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel

Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control.

High-Dimensional Continuous Control Using Generalized Advantage Estimation

17 code implementations 8 Jun 2015 John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.

Continuous Control Policy Gradient Methods +1
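The GAE estimator behind this entry has a compact recursive form; a minimal sketch (function name illustrative, not from the paper's code) is:

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation (a minimal sketch).

    rewards: r_0 .. r_{T-1}
    values:  V(s_0) .. V(s_T)  (length T+1; the last entry bootstraps the tail)
    Returns A_0 .. A_{T-1}, where
      delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
      A_t     = delta_t + gamma * lam * A_{t+1}
    """
    T = len(rewards)
    adv = [0.0] * T
    running = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        adv[t] = running
    return adv
```

Setting `lam=0` recovers one-step TD errors (low variance, more bias); `lam=1` recovers the Monte Carlo advantage (high variance, less bias).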

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

1 code implementation 3 Jul 2015 Bradly C. Stadie, Sergey Levine, Pieter Abbeel

By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces.

Atari Games reinforcement-learning +2

Learning Deep Neural Network Policies with Continuous Memory States

no code implementations 5 Jul 2015 Marvin Zhang, Zoe McCarthy, Chelsea Finn, Sergey Levine, Pieter Abbeel

We evaluate our method on tasks involving continuous control in manipulation and navigation settings, and show that our method can learn complex policies that successfully complete a range of tasks that require memory.

Continuous Control Memorization

Recurrent Network Models for Human Dynamics

no code implementations ICCV 2015 Katerina Fragkiadaki, Sergey Levine, Panna Felsen, Jitendra Malik

We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture.

Ranked #8 on Human Pose Forecasting on Human3.6M (MAR, walking, 1,000ms metric)

Human Dynamics Human Pose Forecasting +2

Deep Spatial Autoencoders for Visuomotor Learning

1 code implementation 21 Sep 2015 Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel

Our method uses a deep spatial autoencoder to acquire a set of feature points that describe the environment for the current task, such as the positions of objects, and then learns a motion skill with these feature points using an efficient reinforcement learning method based on local linear models.

reinforcement-learning Reinforcement Learning (RL)

Learning Deep Control Policies for Autonomous Aerial Vehicles with MPC-Guided Policy Search

no code implementations 22 Sep 2015 Tianhao Zhang, Gregory Kahn, Sergey Levine, Pieter Abbeel

We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment.

Model Predictive Control reinforcement-learning +1

One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors

no code implementations 23 Sep 2015 Justin Fu, Sergey Levine, Pieter Abbeel

One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand.

Model-based Reinforcement Learning Model Predictive Control +3

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

2 code implementations 16 Nov 2015 Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

Adapting Deep Visuomotor Representations with Weak Pairwise Constraints

no code implementations 23 Nov 2015 Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell

We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains.

Domain Adaptation

Learning Visual Predictive Models of Physics for Playing Billiards

no code implementations 23 Nov 2015 Katerina Fragkiadaki, Pulkit Agrawal, Sergey Levine, Jitendra Malik

The ability to plan and execute goal specific actions in varied, unexpected settings is a central requirement of intelligent agents.

Value Iteration Networks

8 code implementations NeurIPS 2016 Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel

We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within.

reinforcement-learning Reinforcement Learning (RL)
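The planning module a VIN embeds is classical value iteration, unrolled as a differentiable computation; a tabular sketch of that underlying recursion (not the paper's convolutional implementation) is:

```python
def value_iteration(P, R, gamma=0.9, iters=200):
    """Tabular value iteration: V(s) <- max_a [ R(s,a) + gamma * E[V(s')] ].

    P[a][s][s2] is the transition probability s -> s2 under action a;
    R[s][a] is the immediate reward. A VIN performs this same backup
    with convolutions so it can be trained end to end.
    """
    S, A = len(R), len(R[0])
    V = [0.0] * S
    for _ in range(iters):
        V = [
            max(
                R[s][a] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(S))
                for a in range(A)
            )
            for s in range(S)
        ]
    return V
```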

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

4 code implementations 1 Mar 2016 Chelsea Finn, Sergey Levine, Pieter Abbeel

We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems.

Feature Engineering

Continuous Deep Q-Learning with Model-based Acceleration

8 code implementations 2 Mar 2016 Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.

Continuous Control Q-Learning +2

PLATO: Policy Learning using Adaptive Trajectory Optimization

no code implementations 2 Mar 2016 Gregory Kahn, Tianhao Zhang, Sergey Levine, Pieter Abbeel

PLATO also maintains the MPC cost as an objective to avoid highly undesirable actions that would result from strictly following the learned policy before it has been fully trained.

Model Predictive Control

Learning Dexterous Manipulation for a Soft Robotic Hand from Human Demonstration

no code implementations 21 Mar 2016 Abhishek Gupta, Clemens Eppner, Sergey Levine, Pieter Abbeel

In this paper, we describe an approach to learning from demonstration that can be used to train soft robotic hands to perform dexterous manipulation tasks.

Backprop KF: Learning Discriminative Deterministic State Estimators

1 code implementation NeurIPS 2016 Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.

Autonomous Vehicles Visual Odometry

Unsupervised Learning for Physical Interaction through Video Prediction

3 code implementations NeurIPS 2016 Chelsea Finn, Ian Goodfellow, Sergey Levine

A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment.

Object Video Generation +1

Guided Policy Search as Approximate Mirror Descent

1 code implementation 15 Jul 2016 William Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

no code implementations 22 Sep 2016 Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.

reinforcement-learning Reinforcement Learning (RL) +2

Learning from the Hindsight Plan -- Episodic MPC Improvement

1 code implementation 28 Sep 2016 Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel

To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function with the goal of satisfying the following property: short horizon planning (as realistic during real executions) with respect to the shaped cost should result in mimicking the hindsight plan.

Model Predictive Control

Deep Reinforcement Learning for Tensegrity Robot Locomotion

no code implementations 28 Sep 2016 Marvin Zhang, Xinyang Geng, Jonathan Bruce, Ken Caluwaerts, Massimo Vespignani, Vytas SunSpiral, Pieter Abbeel, Sergey Levine

We evaluate our method with real-world and simulated experiments on the SUPERball tensegrity robot, showing that the learned policies generalize to changes in system parameters, unreliable sensor measurements, and variation in environmental conditions, including varied terrains and a range of different gravities.

reinforcement-learning Reinforcement Learning (RL)

Path Integral Guided Policy Search

no code implementations 3 Oct 2016 Yevgen Chebotar, Mrinal Kalakrishnan, Ali Yahya, Adrian Li, Stefan Schaal, Sergey Levine

We extend GPS in the following ways: (1) we propose the use of a model-free local optimizer based on path integral stochastic optimal control (PI2), which enables us to learn local policies for tasks with highly discontinuous contact dynamics; and (2) we enable GPS to train on a new set of task instances in every iteration by using on-policy sampling: this increases the diversity of the instances that the policy is trained on, and is crucial for achieving good generalization.

Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search

no code implementations 3 Oct 2016 Ali Yahya, Adrian Li, Mrinal Kalakrishnan, Yevgen Chebotar, Sergey Levine

In this work, we explore distributed and asynchronous policy learning as a means to achieve generalization and improved training times on challenging, real-world manipulation tasks.

reinforcement-learning Reinforcement Learning (RL)

Deep Visual Foresight for Planning Robot Motion

1 code implementation 3 Oct 2016 Chelsea Finn, Sergey Levine

A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback.

Model-based Reinforcement Learning Model Predictive Control +2

Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates

no code implementations 3 Oct 2016 Shixiang Gu, Ethan Holly, Timothy Lillicrap, Sergey Levine

In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

reinforcement-learning Reinforcement Learning (RL)

Reset-Free Guided Policy Search: Efficient Deep Reinforcement Learning with Stochastic Initial States

no code implementations 4 Oct 2016 William Montgomery, Anurag Ajay, Chelsea Finn, Pieter Abbeel, Sergey Levine

Autonomous learning of robotic skills can allow general-purpose robots to learn wide behavioral repertoires without requiring extensive manual engineering.

reinforcement-learning Reinforcement Learning (RL)

EPOpt: Learning Robust Neural Network Policies Using Model Ensembles

no code implementations 5 Oct 2016 Aravind Rajeswaran, Sarvjeet Ghotra, Balaraman Ravindran, Sergey Levine

Sample complexity and safety are major challenges when learning policies with reinforcement learning for real-world tasks, especially when the policies are represented using rich function approximators like deep neural networks.

Domain Adaptation

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

2 code implementations 7 Nov 2016 Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine

We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.

Continuous Control Policy Gradient Methods +2

A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models

3 code implementations 11 Nov 2016 Chelsea Finn, Paul Christiano, Pieter Abbeel, Sergey Levine

In particular, we demonstrate an equivalence between a sample-based algorithm for maximum entropy IRL and a GAN in which the generator's density can be evaluated and is provided as an additional input to the discriminator.

Imitation Learning reinforcement-learning +1

CAD2RL: Real Single-Image Flight without a Single Real Image

1 code implementation 13 Nov 2016 Fereshteh Sadeghi, Sergey Levine

We propose a learning method that we call CAD²RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models.

Collision Avoidance Depth Estimation +2

Learning Dexterous Manipulation Policies from Experience and Imitation

no code implementations 15 Nov 2016 Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine

We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state.

Guided Policy Search via Approximate Mirror Descent

no code implementations NeurIPS 2016 William H. Montgomery, Sergey Levine

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space.

Generalizing Skills with Semi-Supervised Reinforcement Learning

no code implementations 1 Dec 2016 Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine

We evaluate our method on challenging tasks that require control directly from images, and show that our approach can improve the generalization of a learned deep neural network policy by using experience for which no reward function is available.

reinforcement-learning Reinforcement Learning (RL)

Unsupervised Perceptual Rewards for Imitation Learning

no code implementations 20 Dec 2016 Pierre Sermanet, Kelvin Xu, Sergey Levine

We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps.

Imitation Learning Reinforcement Learning (RL)

Uncertainty-Aware Reinforcement Learning for Collision Avoidance

no code implementations 3 Feb 2017 Gregory Kahn, Adam Villaflor, Vitchyr Pong, Pieter Abbeel, Sergey Levine

However, practical deployment of reinforcement learning methods must contend with the fact that the training process itself can be unsafe for the robot.

Collision Avoidance Navigate +2

Reinforcement Learning with Deep Energy-Based Policies

3 code implementations ICML 2017 Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before.

Q-Learning reinforcement-learning +1
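In the discrete-action case, an energy-based policy of the kind this entry describes is a Boltzmann distribution over Q-values with a soft (log-sum-exp) state value; a minimal sketch (function names illustrative) is:

```python
import math

def soft_value(q, alpha=1.0):
    """Soft state value V = alpha * log sum_a exp(Q(a) / alpha)."""
    m = max(q)  # subtract the max to stabilize the exponentials
    return alpha * math.log(sum(math.exp((qa - m) / alpha) for qa in q)) + m

def energy_based_policy(q, alpha=1.0):
    """pi(a) proportional to exp(Q(a) / alpha).

    Higher temperature alpha keeps more entropy in the policy;
    alpha -> 0 recovers the greedy (hard-max) policy.
    """
    m = max(q)
    w = [math.exp((qa - m) / alpha) for qa in q]
    z = sum(w)
    return [wi / z for wi in w]
```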

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

82 code implementations ICML 2017 Chelsea Finn, Pieter Abbeel, Sergey Levine

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning.

Few-Shot Image Classification General Classification +4
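MAML's meta-gradient differentiates through the inner adaptation step. On toy quadratic task losses that derivative has a closed form, which makes the two-level structure easy to see in a few lines (an illustrative sketch with hypothetical names, not the paper's code):

```python
def maml_meta_step(theta, task_optima, inner_lr=0.1, outer_lr=0.5):
    """One MAML meta-update on toy quadratic tasks.

    Task i has loss L_i(t) = 0.5 * (t - c_i)**2, so both the inner SGD
    step and the gradient *through* that step have closed forms:
      adapted_i        = theta - inner_lr * (theta - c_i)
      dL_i(adapted)/d0 = (adapted_i - c_i) * (1 - inner_lr)
    """
    meta_grad = 0.0
    for c in task_optima:
        adapted = theta - inner_lr * (theta - c)       # inner adaptation step
        meta_grad += (adapted - c) * (1.0 - inner_lr)  # gradient through it
    return theta - outer_lr * meta_grad / len(task_optima)

theta = 0.0
for _ in range(200):
    theta = maml_meta_step(theta, task_optima=[1.0, 3.0])
# theta converges to the initialization from which one gradient step
# adapts best to every task; here, the mean of the task optima.
```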

Goal-Driven Dynamics Learning via Bayesian Optimization

no code implementations 27 Mar 2017 Somil Bansal, Roberto Calandra, Ted Xiao, Sergey Levine, Claire J. Tomlin

Real-world robots are becoming increasingly complex and commonly act in poorly understood environments where it is extremely challenging to model or learn their true dynamics.

Active Learning Bayesian Optimization

Learning Visual Servoing with Deep Features and Fitted Q-Iteration

2 code implementations 31 Mar 2017 Alex X. Lee, Sergey Levine, Pieter Abbeel

Our approach is based on servoing the camera in the space of learned visual features, rather than image pixels or manually-designed keypoints.

reinforcement-learning Reinforcement Learning (RL)

Time-Contrastive Networks: Self-Supervised Learning from Video

7 code implementations 23 Apr 2017 Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine

While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.

Metric Learning reinforcement-learning +3

End-to-End Learning of Semantic Grasping

no code implementations 6 Jul 2017 Eric Jang, Sudheendra Vijayanarasimhan, Peter Pastor, Julian Ibarz, Sergey Levine

We consider the task of semantic robotic grasping, in which a robot picks up an object of a user-specified class using only monocular images.

Object object-detection +3

Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

1 code implementation 10 Jul 2017 Rouhollah Rahmatizadeh, Pooya Abolghasemi, Ladislau Bölöni, Sergey Levine

We propose a technique for multi-task learning from demonstration that trains the controller of a low-cost robotic arm to accomplish several complex picking and placing tasks, as well as non-prehensile manipulation.

Multi-Task Learning Position

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

1 code implementation 11 Jul 2017 YuXuan Liu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator.

Imitation Learning Translation +1

GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images

no code implementations ICCV 2017 Avi Singh, Larry Yang, Sergey Levine

We show that pairing interaction data from just a single environment with a diverse dataset of weakly labeled data results in greatly improved generalization to unseen environments, and show that this generalization depends on both the auxiliary objective and the attentional architecture that we propose.

Binary Classification Domain Adaptation

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

8 code implementations 8 Aug 2017 Anusha Nagabandi, Gregory Kahn, Ronald S. Fearing, Sergey Levine

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.

Model-based Reinforcement Learning Model Predictive Control +2
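Planning with a learned dynamics model is often done by random-shooting MPC: sample candidate action sequences, roll them out through the model, and execute the first action of the cheapest one. A minimal sketch under a toy model (names and the toy dynamics are illustrative, not the paper's code):

```python
import random

def random_shooting_mpc(dynamics, cost, state, horizon=5, n_candidates=200, seed=0):
    """Random-shooting MPC: return the first action of the lowest-cost
    sampled action sequence under the given (possibly learned) model."""
    rng = random.Random(seed)
    best_cost, best_first = float("inf"), None
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, total = state, 0.0
        for a in actions:
            s = dynamics(s, a)   # roll the model forward
            total += cost(s, a)  # accumulate predicted cost
        if total < best_cost:
            best_cost, best_first = total, actions[0]
    return best_first

# Toy model: s' = s + a with cost s'^2; from s = 1 the planner pushes toward 0.
a0 = random_shooting_mpc(lambda s, a: s + a, lambda s, a: s * s, 1.0, horizon=1)
```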

Deep Object-Centric Representations for Generalizable Robot Learning

1 code implementation 14 Aug 2017 Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine

We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.

Object Reinforcement Learning (RL)

One-Shot Visual Imitation Learning via Meta-Learning

3 code implementations 14 Sep 2017 Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, Sergey Levine

In this work, we present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration.

Imitation Learning Meta-Learning

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

1 code implementation 22 Sep 2017 Konstantinos Bousmalis, Alex Irpan, Paul Wohlhart, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke

We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN.

Domain Adaptation Industrial Robots +1

Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation

2 code implementations 29 Sep 2017 Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine

To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based.

Navigate Q-Learning +3

Self-Supervised Visual Planning with Temporal Skip Connections

3 code implementations 15 Oct 2017 Frederik Ebert, Chelsea Finn, Alex X. Lee, Sergey Levine

One learning signal that is always available for autonomously collected data is prediction: if a robot can learn to predict the future, it can use this predictive model to take actions to produce desired outcomes, such as moving an object to a particular location.

Video Prediction

The Feeling of Success: Does Touch Sensing Help Predict Grasp Outcomes?

1 code implementation 16 Oct 2017 Roberto Calandra, Andrew Owens, Manu Upadhyaya, Wenzhen Yuan, Justin Lin, Edward H. Adelson, Sergey Levine

In this work, we investigate the question of whether touch sensing aids in predicting grasp outcomes within a multimodal sensing framework that combines vision and touch.

Industrial Robots Robotic Grasping

Stochastic Variational Video Prediction

3 code implementations ICLR 2018 Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, Sergey Levine

We find that our proposed method produces substantially improved video predictions when compared to the same model without stochasticity, and to other stochastic video prediction methods.

Video Generation Video Prediction

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

7 code implementations 30 Oct 2017 Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Decision Making reinforcement-learning +1

Regret Minimization for Partially Observable Deep Reinforcement Learning

1 code implementation ICML 2018 Peter Jin, Kurt Keutzer, Sergey Levine

Deep reinforcement learning algorithms that estimate state and state-action value functions have been shown to be effective in a variety of challenging domains, including learning control strategies from raw image pixels.

counterfactual reinforcement-learning +1

Learning with Latent Language

1 code implementation NAACL 2018 Jacob Andreas, Dan Klein, Sergey Levine

The named concepts and compositional operators present in natural language provide a rich source of information about the kinds of abstractions humans use to navigate the world.

Image Classification Navigate

Learning Image-Conditioned Dynamics Models for Control of Under-actuated Legged Millirobots

no code implementations 14 Nov 2017 Anusha Nagabandi, Guangzhao Yang, Thomas Asmar, Ravi Pandya, Gregory Kahn, Sergey Levine, Ronald S. Fearing

We present an approach for controlling a real-world legged millirobot that is based on learned neural network models.

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

1 code implementation ICLR 2018 Benjamin Eysenbach, Shixiang Gu, Julian Ibarz, Sergey Levine

In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt.

reinforcement-learning Reinforcement Learning (RL)

Divide-and-Conquer Reinforcement Learning

1 code implementation ICLR 2018 Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine

In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.

Policy Gradient Methods reinforcement-learning +1

Sim2Real View Invariant Visual Servoing by Recurrent Control

no code implementations 20 Dec 2017 Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine

To this end, we train a deep recurrent controller that can automatically determine which actions move the end-point of a robotic arm to a desired object.

Unifying Map and Landmark Based Representations for Visual Navigation

no code implementations 21 Dec 2017 Saurabh Gupta, David Fouhey, Sergey Levine, Jitendra Malik

This work presents a formulation for visual navigation that unifies map-based spatial reasoning and path planning with landmark-based robust plan execution in noisy environments.

Navigate Visual Navigation

Self-Supervised Learning of Object Motion Through Adversarial Video Prediction

no code implementations ICLR 2018 Alex X. Lee, Frederik Ebert, Richard Zhang, Chelsea Finn, Pieter Abbeel, Sergey Levine

In this paper, we study the problem of multi-step video prediction, where the goal is to predict a sequence of future frames conditioned on a short context.

Object Self-Supervised Learning +1

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

no code implementations ICLR 2018 Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Imitation Learning reinforcement-learning +1

Combining Model-based and Model-free RL via Multi-step Control Variates

no code implementations ICLR 2018 Tong Che, Yuchen Lu, George Tucker, Surya Bhupatiraju, Shane Gu, Sergey Levine, Yoshua Bengio

Model-free deep reinforcement learning algorithms are able to successfully solve a wide range of continuous control tasks, but typically require many on-policy samples to achieve good performance.

Continuous Control OpenAI Gym

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes

no code implementations ICLR 2018 Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths

Meta-learning allows an intelligent agent to leverage prior learning episodes as a basis for quickly improving performance on a novel task.

Meta-Learning

Shared Autonomy via Deep Reinforcement Learning

1 code implementation 6 Feb 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.

reinforcement-learning Reinforcement Learning (RL)

Reinforcement Learning from Imperfect Demonstrations

no code implementations ICLR 2018 Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell

We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.

reinforcement-learning Reinforcement Learning (RL)

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

no code implementations ICLR 2018 Vitchyr Pong, Shixiang Gu, Murtaza Dalal, Sergey Levine

TDMs combine the benefits of model-free and model-based RL: they leverage the rich information in state transitions to learn very efficiently, while still attaining asymptotic performance that exceeds that of direct model-based RL methods.

Continuous Control Q-Learning +1

The Mirage of Action-Dependent Baselines in Reinforcement Learning

1 code implementation ICML 2018 George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine

Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.

Policy Gradient Methods reinforcement-learning +1

Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning

no code implementations 28 Feb 2018 Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine

By enabling wider use of learned dynamics models within a model-free reinforcement learning algorithm, we improve value estimation, which, in turn, reduces the sample complexity of learning.

Continuous Control reinforcement-learning +1

Learning Flexible and Reusable Locomotion Primitives for a Microrobot

no code implementations 1 Mar 2018 Brian Yang, Grant Wang, Roberto Calandra, Daniel Contreras, Sergey Levine, Kristofer Pister

This approach formalizes locomotion as a contextual policy search task to collect data, and subsequently uses that data to learn multi-objective locomotion primitives that can be used for planning.

Navigate

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation 19 Mar 2018 Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.

Q-Learning reinforcement-learning +1

Learning to Adapt in Dynamic, Real-World Environments Through Meta-Reinforcement Learning

2 code implementations ICLR 2019 Anusha Nagabandi, Ignasi Clavera, Simin Liu, Ronald S. Fearing, Pieter Abbeel, Sergey Levine, Chelsea Finn

Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.

Continuous Control Meta-Learning +5

Learning to Run challenge: Synthesizing physiologically accurate motion using deep reinforcement learning

no code implementations31 Mar 2018 Łukasz Kidziński, Sharada P. Mohanty, Carmichael Ong, Jennifer L. Hicks, Sean F. Carroll, Sergey Levine, Marcel Salathé, Scott L. Delp

Synthesizing physiologically-accurate human movement in a variety of conditions can help practitioners plan surgeries, design experiments, or prototype assistive devices in simulated environments, reducing time and costs and improving treatment outcomes.

Navigate reinforcement-learning +1

Universal Planning Networks

1 code implementation2 Apr 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.

Imitation Learning Representation Learning +1

Stochastic Adversarial Video Prediction

4 code implementations ICLR 2019 Alex X. Lee, Richard Zhang, Frederik Ebert, Pieter Abbeel, Chelsea Finn, Sergey Levine

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction.

Ranked #1 on Video Prediction on KTH (Cond metric)

Representation Learning Video Generation +1

DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

6 code implementations8 Apr 2018 Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne

We further explore a number of methods for integrating multiple clips into the learning process to develop multi-skilled agents capable of performing a rich repertoire of diverse skills.

Motion Synthesis reinforcement-learning +1

Latent Space Policies for Hierarchical Reinforcement Learning

no code implementations ICML 2018 Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.

Hierarchical Reinforcement Learning reinforcement-learning +1

Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review

2 code implementations2 May 2018 Sergey Levine

The framework of reinforcement learning or optimal control provides a mathematical formalization of intelligent decision making that is powerful and broadly applicable.

Decision Making reinforcement-learning +2

Data-Efficient Hierarchical Reinforcement Learning

12 code implementations NeurIPS 2018 Ofir Nachum, Shixiang Gu, Honglak Lee, Sergey Levine

In this paper, we study how we can develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.

Hierarchical Reinforcement Learning reinforcement-learning +1

Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

1 code implementation NeurIPS 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Few-Shot Segmentation Propagation with Guided Networks

1 code implementation25 May 2018 Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.

Interactive Segmentation Segmentation +3

More Than a Feeling: Learning to Grasp and Regrasp using Vision and Touch

no code implementations28 May 2018 Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward H. Adelson, Sergey Levine

This model -- a deep, multimodal convolutional network -- predicts the outcome of a candidate grasp adjustment, and then executes a grasp by iteratively selecting the most promising actions.

Robotic Grasping

Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition

no code implementations NeurIPS 2018 Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine

We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.

Continuous Control reinforcement-learning +1

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

10 code implementations NeurIPS 2018 Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance.

Model-based Reinforcement Learning reinforcement-learning +1

Learning a Prior over Intent via Meta-Inverse Reinforcement Learning

no code implementations31 May 2018 Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

A significant challenge for the practical application of reinforcement learning in the real world is the need to specify an oracle reward function that correctly defines a task.

reinforcement-learning Reinforcement Learning (RL)

Sim2Real Viewpoint Invariant Visual Servoing by Recurrent Control

no code implementations CVPR 2018 Fereshteh Sadeghi, Alexander Toshev, Eric Jang, Sergey Levine

In robotics, this ability is referred to as visual servoing: moving a tool or end-point to a desired location using primarily visual feedback.

Robot Manipulation

Probabilistic Model-Agnostic Meta-Learning

1 code implementation NeurIPS 2018 Chelsea Finn, Kelvin Xu, Sergey Levine

However, a critical challenge in few-shot learning is task ambiguity: even when a powerful prior can be meta-learned from a large number of prior tasks, a small dataset for a new task can simply be too ambiguous to acquire a single model (e.g., a classifier) for that task that is accurate.

Active Learning Few-Shot Image Classification +1

Unsupervised Meta-Learning for Reinforcement Learning

no code implementations ICLR 2020 Abhishek Gupta, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

In the context of reinforcement learning, meta-learning algorithms acquire reinforcement learning procedures to solve new problems more efficiently by utilizing experience from prior tasks.

Meta-Learning Meta Reinforcement Learning +3

Learning Instance Segmentation by Interaction

1 code implementation21 Jun 2018 Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik

The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.

Instance Segmentation Segmentation +1

Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control

1 code implementation ICML 2018 Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn

A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.

Imitation Learning

Visual Reinforcement Learning with Imagined Goals

2 code implementations NeurIPS 2018 Ashvin Nair, Vitchyr Pong, Murtaza Dalal, Shikhar Bahl, Steven Lin, Sergey Levine

For an autonomous agent to fulfill a wide range of user-specified goals at test time, it must be able to learn broadly applicable and general-purpose skill repertoires.

reinforcement-learning Reinforcement Learning (RL) +1

Automatically Composing Representation Transformations as a Means for Generalization

1 code implementation ICLR 2019 Michael B. Chang, Abhishek Gupta, Sergey Levine, Thomas L. Griffiths

A generally intelligent learner should generalize to more complex tasks than it has previously encountered, but the two common paradigms in machine learning -- either training a separate learner per task or training a single learner for all tasks -- both have difficulty with such generalization because they do not leverage the compositional structure of the task distribution.

Decision Making

SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning

1 code implementation ICLR 2019 Marvin Zhang, Sharad Vikram, Laura Smith, Pieter Abbeel, Matthew J. Johnson, Sergey Levine

Model-based reinforcement learning (RL) has proven to be a data efficient approach for learning control tasks but is difficult to utilize in domains with complex observations such as images.

Model-based Reinforcement Learning reinforcement-learning +1

Unsupervised Exploration with Deep Model-Based Reinforcement Learning

no code implementations27 Sep 2018 Kurtland Chua, Rowan McAllister, Roberto Calandra, Sergey Levine

We show that both challenges can be addressed by representing model-uncertainty, which can both guide exploration in the unsupervised phase and ensure that the errors in the model are not exploited by the planner in the goal-directed phase.

Model-based Reinforcement Learning reinforcement-learning +1

EMI: Exploration with Mutual Information Maximizing State and Action Embeddings

no code implementations27 Sep 2018 HyoungSeok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Policy optimization struggles when the reward feedback signal is very sparse and essentially becomes a random search algorithm until the agent stumbles upon a rewarding state or the goal state.

Continuous Control

What Would π* Do?: Imitation Learning via Off-Policy Reinforcement Learning

no code implementations27 Sep 2018 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Learning to imitate expert actions given demonstrations containing image observations is a difficult problem in robotic control.

Imitation Learning Q-Learning +2

Few-Shot Goal Inference for Visuomotor Learning and Planning

no code implementations30 Sep 2018 Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn

To that end, we formulate the few-shot objective learning problem, where the goal is to learn a task objective from only a few example images of successful end states for that task.

reinforcement-learning Reinforcement Learning (RL) +1

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

5 code implementations ICLR 2019 Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine

By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.

Continuous Control Image Generation +1
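
The mutual-information constraint can be sketched as a KL penalty on a stochastic encoder. This is a simplified, assumed form: the paper enforces the constraint with an adaptively updated dual variable, whereas the `vdb_penalty` helper below uses a fixed coefficient for illustration.

```python
import numpy as np

def kl_to_unit_gaussian(mu, log_sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) per dimension."""
    return 0.5 * (mu**2 + np.exp(2 * log_sigma) - 2 * log_sigma - 1.0)

def vdb_penalty(mus, log_sigmas, ic=0.1, beta=1.0):
    """Penalty when the average encoder KL exceeds the information constraint ic."""
    avg_kl = float(np.mean(kl_to_unit_gaussian(mus, log_sigmas)))
    return beta * max(0.0, avg_kl - ic)

mus = np.array([0.0, 1.0])
log_sigmas = np.array([0.0, 0.0])
print(vdb_penalty(mus, log_sigmas))   # average KL 0.25 exceeds ic by 0.15
```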

Time Reversal as Self-Supervision

no code implementations2 Oct 2018 Suraj Nair, Mohammad Babaeizadeh, Chelsea Finn, Sergey Levine, Vikash Kumar

We test our method on the domain of assembly, specifically the mating of Tetris-style block pairs.

Model Predictive Control

EMI: Exploration with Mutual Information

1 code implementation2 Oct 2018 Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Reinforcement learning algorithms struggle when the reward signal is very sparse.

Continuous Control Reinforcement Learning (RL)

Unsupervised Learning via Meta-Learning

no code implementations ICLR 2019 Kyle Hsu, Sergey Levine, Chelsea Finn

A central goal of unsupervised learning is to acquire representations from unlabeled data or experience that can be used for more effective learning of downstream tasks from modest amounts of labeled data.

Clustering Disentanglement +3

Robustness via Retrying: Closed-Loop Robotic Manipulation with Self-Supervised Learning

3 code implementations6 Oct 2018 Frederik Ebert, Sudeep Dasari, Alex X. Lee, Sergey Levine, Chelsea Finn

We demonstrate that this idea can be combined with a video-prediction based controller to enable complex behaviors to be learned from scratch using only raw visual inputs, including grasping, repositioning objects, and non-prehensile manipulation.

Image Registration Self-Supervised Learning +1

SFV: Reinforcement Learning of Physical Skills from Videos

1 code implementation8 Oct 2018 Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine

In this paper, we propose a method that enables physically simulated characters to learn skills from videos (SFV).

Pose Estimation reinforcement-learning +1

Dexterous Manipulation with Deep Reinforcement Learning: Efficient, General, and Low-Cost

no code implementations14 Oct 2018 Henry Zhu, Abhishek Gupta, Aravind Rajeswaran, Sergey Levine, Vikash Kumar

Dexterous multi-fingered robotic hands can perform a wide range of manipulation skills, making them an appealing component for general-purpose robotic manipulators.

reinforcement-learning Reinforcement Learning (RL)

Composable Action-Conditioned Predictors: Flexible Off-Policy Learning for Robot Navigation

1 code implementation16 Oct 2018 Gregory Kahn, Adam Villaflor, Pieter Abbeel, Sergey Levine

We show that a simulated robotic car and a real-world RC car can gather data and train fully autonomously without any human-provided labels beyond those needed to train the detectors, and then at test-time be able to accomplish a variety of different tasks.

Robot Navigation

One-Shot Hierarchical Imitation Learning of Compound Visuomotor Tasks

no code implementations25 Oct 2018 Tianhe Yu, Pieter Abbeel, Sergey Levine, Chelsea Finn

We consider the problem of learning multi-stage vision-based tasks on a real robot from a single video of a human performing the task, while leveraging demonstration data of subtasks with other objects.

Imitation Learning

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

1 code implementation16 Nov 2018 Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin.

Object Representation Learning

Learning Actionable Representations with Goal-Conditioned Policies

1 code implementation19 Nov 2018 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +3

Guiding Policies with Language via Meta-Learning

1 code implementation ICLR 2019 John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

However, a single instruction may be insufficient to fully communicate our intent or, even if it is, may be insufficient for an autonomous agent to actually understand how to perform the desired task.

Imitation Learning Instruction Following +1

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

1 code implementation3 Dec 2018 Frederik Ebert, Chelsea Finn, Sudeep Dasari, Annie Xie, Alex Lee, Sergey Levine

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains.

reinforcement-learning Reinforcement Learning (RL)

Visual Memory for Robust Path Following

no code implementations NeurIPS 2018 Ashish Kumar, Saurabh Gupta, David Fouhey, Sergey Levine, Jitendra Malik

Equipped with this abstraction, a second network observes the world and decides how to act to retrace the path under noisy actuation and a changing environment.

Residual Reinforcement Learning for Robot Control

no code implementations7 Dec 2018 Tobias Johannink, Shikhar Bahl, Ashvin Nair, Jianlan Luo, Avinash Kumar, Matthias Loskyll, Juan Aparicio Ojea, Eugen Solowjow, Sergey Levine

In this paper, we study how we can solve difficult control problems in the real world by decomposing them into a part that is solved efficiently by conventional feedback control methods, and the residual which is solved with RL.

Friction reinforcement-learning +1
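
The decomposition described above can be sketched as a superposition of controls (a minimal sketch with assumed gains; the PD controller and residual here are illustrative placeholders, not the paper's setup): the hand-designed feedback handles the bulk of the task, and RL only has to learn a residual correction.

```python
import math

def pd_controller(state, target, kp=2.0, kd=0.5):
    """Hand-designed part: PD feedback on position and velocity."""
    pos, vel = state
    return kp * (target - pos) - kd * vel

def residual_policy(state):
    # Placeholder for the learned part; a trained policy network would go here.
    return 0.1 * math.tanh(state[0])

def control(state, target):
    # Superposition: conventional feedback plus learned residual correction.
    return pd_controller(state, target) + residual_policy(state)

print(control((0.0, 0.0), 1.0))   # PD term dominates far from the target
```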

Deep Online Learning via Meta-Learning: Continual Adaptation for Model-Based RL

no code implementations ICLR 2019 Anusha Nagabandi, Chelsea Finn, Sergey Levine

The goal in this paper is to develop a method for continual online learning from an incoming stream of data, using deep neural network models.

Meta-Learning Model-based Reinforcement Learning

Learning to Walk via Deep Reinforcement Learning

no code implementations26 Dec 2018 Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine

In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.

reinforcement-learning Reinforcement Learning (RL)

Robustness to Out-of-Distribution Inputs via Task-Aware Generative Uncertainty

no code implementations27 Dec 2018 Rowan McAllister, Gregory Kahn, Jeff Clune, Sergey Levine

Our method estimates an uncertainty measure about the model's prediction, taking into account an explicit (generative) model of the observation distribution to handle out-of-distribution inputs.

Low Level Control of a Quadrotor with Deep Model-Based Reinforcement Learning

no code implementations11 Jan 2019 Nathan O. Lambert, Daniel S. Drew, Joseph Yaconelli, Roberto Calandra, Sergey Levine, Kristofer S. J. Pister

Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times.

Model-based Reinforcement Learning reinforcement-learning +1

InfoBot: Transfer and Exploration via the Information Bottleneck

no code implementations30 Jan 2019 Anirudh Goyal, Riashat Islam, Daniel Strouse, Zafarali Ahmed, Matthew Botvinick, Hugo Larochelle, Yoshua Bengio, Sergey Levine

In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.

From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following

no code implementations ICLR 2019 Justin Fu, Anoop Korattikara, Sergey Levine, Sergio Guadarrama

In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments.

Instruction Following reinforcement-learning +1

Online Meta-Learning

no code implementations ICLR Workshop LLD 2019 Chelsea Finn, Aravind Rajeswaran, Sham Kakade, Sergey Levine

Meta-learning views this problem as learning a prior over model parameters that is amenable for fast adaptation on a new task, but typically assumes the set of tasks are available together as a batch.

Meta-Learning

Diagnosing Bottlenecks in Deep Q-learning Algorithms

1 code implementation26 Feb 2019 Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL).

Continuous Control Q-Learning +2

Model-Based Reinforcement Learning for Atari

2 code implementations1 Mar 2019 Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Atari Games 100k +4

VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation

1 code implementation ICLR 2020 Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.

Predict Future Video Frames Video Generation

Learning Latent Plans from Play

1 code implementation5 Mar 2019 Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet

Learning from play (LfP) offers three main advantages: 1) It is cheap.

Robotics

Learning to Identify Object Instances by Touch: Tactile Recognition via Multimodal Matching

no code implementations8 Mar 2019 Justin Lin, Roberto Calandra, Sergey Levine

We propose a novel framing of the problem as multi-modal recognition: the goal of our system is to recognize, given a visual and tactile observation, whether or not these observations correspond to the same object.

Object

Manipulation by Feel: Touch-Based Control with Deep Predictive Models

no code implementations11 Mar 2019 Stephen Tian, Frederik Ebert, Dinesh Jayaraman, Mayur Mudigonda, Chelsea Finn, Roberto Calandra, Sergey Levine

Touch sensing is widely acknowledged to be important for dexterous robotic manipulation, but exploiting tactile sensing for continuous, non-prehensile manipulation is challenging.

Wasserstein Dependency Measure for Representation Learning

no code implementations NeurIPS 2019 Sherjil Ozair, Corey Lynch, Yoshua Bengio, Aaron van den Oord, Sergey Levine, Pierre Sermanet

Mutual information maximization has emerged as a powerful learning objective for unsupervised representation learning obtaining state-of-the-art performance in applications such as object recognition, speech recognition, and reinforcement learning.

Object Recognition reinforcement-learning +5

Guided Meta-Policy Search

no code implementations NeurIPS 2019 Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn

Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.

Continuous Control Imitation Learning +4

Improvisation through Physical Understanding: Using Novel Objects as Tools with Visual Foresight

no code implementations11 Apr 2019 Annie Xie, Frederik Ebert, Sergey Levine, Chelsea Finn

Machine learning techniques have enabled robots to learn narrow, yet complex tasks and also perform broad, yet simple skills with a wide variety of objects.

Imitation Learning Self-Supervised Learning

End-to-End Robotic Reinforcement Learning without Reward Engineering

3 code implementations16 Apr 2019 Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine

In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.

reinforcement-learning Reinforcement Learning (RL)

Transfer and Exploration via the Information Bottleneck

no code implementations ICLR 2019 Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine

In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.

Learning Actionable Representations with Goal Conditioned Policies

no code implementations ICLR 2019 Dibya Ghosh, Abhishek Gupta, Sergey Levine

Most prior work on representation learning has focused on generative approaches, learning representations that capture all the underlying factors of variation in the observation space in a more disentangled or well-ordered manner.

Decision Making Hierarchical Reinforcement Learning +3

Meta-Learning to Guide Segmentation

no code implementations ICLR 2019 Kate Rakelly*, Evan Shelhamer*, Trevor Darrell, Alexei A. Efros, Sergey Levine

To explore generalization, we analyze guidance as a bridge between different levels of supervision to segment classes as the union of instances.

Meta-Learning Segmentation

Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning

no code implementations ICLR 2019 Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn

A significant challenge for the practical application of reinforcement learning to real-world problems is the need to specify an oracle reward function that correctly defines a task.

reinforcement-learning Reinforcement Learning (RL)

PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings

2 code implementations ICCV 2019 Nicholas Rhinehart, Rowan McAllister, Kris Kitani, Sergey Levine

For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.

Autonomous Vehicles

Data-efficient Learning of Morphology and Controller for a Microrobot

1 code implementation3 May 2019 Thomas Liao, Grant Wang, Brian Yang, Rene Lee, Kristofer Pister, Sergey Levine, Roberto Calandra

Robot design is often a slow and difficult process requiring the iterative construction and testing of prototypes, with the goal of sequentially optimizing the design.

Bayesian Optimization

REPLAB: A Reproducible Low-Cost Arm Benchmark Platform for Robotic Learning

no code implementations17 May 2019 Brian Yang, Jesse Zhang, Vitchyr Pong, Sergey Levine, Dinesh Jayaraman

We envision REPLAB as a framework for reproducible research across manipulation tasks, and as a step in this direction, we define a template for a grasping benchmark consisting of a task definition, evaluation protocol, performance measures, and a dataset of 92k grasp attempts.

Benchmarking Machine Translation +1

MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

1 code implementation NeurIPS 2019 Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine

In this work, we propose multiplicative compositional policies (MCP), a method for learning reusable motor skills that can be composed to produce a range of complex behaviors.

Continuous Control

Adversarial Policies: Attacking Deep Reinforcement Learning

2 code implementations ICLR 2020 Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell

Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers.

reinforcement-learning Reinforcement Learning (RL)

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

5 code implementations ICLR 2020 Siddharth Reddy, Anca D. Dragan, Sergey Levine

Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.

Imitation Learning Q-Learning +2
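
The core reward relabeling in SQIL is simple enough to sketch directly (the buffer contents below are made-up placeholders): demonstration transitions are stored with reward +1, the agent's own transitions with reward 0, and standard off-policy RL is run on the combined buffer.

```python
def sqil_reward(transition, is_demo):
    """SQIL's sparse reward: +1 for demonstration transitions, 0 otherwise."""
    return 1.0 if is_demo else 0.0

demo_buffer = [("s0", "a0", "s1"), ("s1", "a1", "s2")]   # expert transitions
agent_buffer = [("s0", "b0", "s3")]                      # agent's own transitions

replay = [(t, sqil_reward(t, True)) for t in demo_buffer] + \
         [(t, sqil_reward(t, False)) for t in agent_buffer]

print([r for _, r in replay])
```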

Causal Confusion in Imitation Learning

2 code implementations NeurIPS 2019 Pim de Haan, Dinesh Jayaraman, Sergey Levine

Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment.

Imitation Learning

Safety Augmented Value Estimation from Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks

no code implementations31 May 2019 Brijen Thananjeyan, Ashwin Balakrishna, Ugo Rosolia, Felix Li, Rowan McAllister, Joseph E. Gonzalez, Sergey Levine, Francesco Borrelli, Ken Goldberg

Reinforcement learning (RL) for robotics is challenging due to the difficulty in hand-engineering a dense cost function, which can lead to unintended behavior, and dynamical uncertainty, which makes exploration and constraint satisfaction challenging.

Model-based Reinforcement Learning reinforcement-learning +1

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

3 code implementations NeurIPS 2019 Aviral Kumar, Justin Fu, George Tucker, Sergey Levine

Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.

Continuous Control Q-Learning
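
The effect described above can be shown with toy numbers (an assumed setup, not the paper's algorithm): a naive Bellman backup maximizes the estimated Q over all actions, including ones absent from the data, so an erroneous estimate on an unseen action propagates; restricting the backup to well-supported actions avoids bootstrapping from it.

```python
import numpy as np

# Erroneously optimistic estimate on action 2, which the data never contains.
est_q = np.array([1.0, 2.0, 9.0, 1.5])
supported = np.array([True, True, False, True])   # actions seen in the data

naive_target = est_q.max()                    # bootstraps from the bad action
constrained_target = est_q[supported].max()   # stays in-distribution

print(naive_target, constrained_target)
```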

Off-Policy Evaluation via Off-Policy Classification

no code implementations NeurIPS 2019 Alex Irpan, Kanishka Rao, Konstantinos Bousmalis, Chris Harris, Julian Ibarz, Sergey Levine

However, for high-dimensional observations, such as images, models of the environment can be difficult to fit and value-based methods can make IS hard to use or even ill-conditioned, especially when dealing with continuous action spaces.

Classification General Classification +2

Learning Powerful Policies by Using Consistent Dynamics Model

1 code implementation11 Jun 2019 Shagun Sodhani, Anirudh Goyal, Tristan Deleu, Yoshua Bengio, Sergey Levine, Jian Tang

There is enough evidence that humans build a model of the environment, not only by observing the environment but also by interacting with the environment.

Atari Games Model-based Reinforcement Learning +1

Efficient Exploration via State Marginal Matching

1 code implementation12 Jun 2019 Lisa Lee, Benjamin Eysenbach, Emilio Parisotto, Eric Xing, Sergey Levine, Ruslan Salakhutdinov

The SMM objective can be viewed as a two-player, zero-sum game between a state density model and a parametric policy, an idea that we use to build an algorithm for optimizing the SMM objective.

Efficient Exploration Unsupervised Reinforcement Learning

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

1 code implementation NeurIPS 2019 Benjamin Eysenbach, Ruslan Salakhutdinov, Sergey Levine

We introduce a general control algorithm that combines the strengths of planning and reinforcement learning to effectively solve these tasks.

reinforcement-learning Reinforcement Learning (RL)

When to Trust Your Model: Model-Based Policy Optimization

11 code implementations NeurIPS 2019 Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.

Model-based Reinforcement Learning reinforcement-learning +1
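
The trade-off above is managed in MBPO with short branched rollouts, sketched here in an assumed simplified form (integer-chain model, made-up states): rather than long model rollouts from initial states, take many k-step rollouts branched from states in the real replay buffer, limiting compounding model error while still generating cheap training data.

```python
import random

random.seed(0)

def model(s, a):
    """Hypothetical learned dynamics model on an integer chain."""
    return s + a

real_states = [0, 3, 7, 12]    # states actually visited in the environment

def branched_rollouts(k, n):
    """Collect n model rollouts of length k, each branched from a real state."""
    data = []
    for _ in range(n):
        s = random.choice(real_states)
        for _ in range(k):
            a = random.choice([-1, 1])
            s2 = model(s, a)
            data.append((s, a, s2))
            s = s2
    return data

rollouts = branched_rollouts(k=2, n=5)
print(len(rollouts))   # n rollouts of length k -> 10 model transitions
```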

Dynamics-Aware Unsupervised Discovery of Skills

3 code implementations2 Jul 2019 Archit Sharma, Shixiang Gu, Sergey Levine, Vikash Kumar, Karol Hausman

Conventionally, model-based reinforcement learning (MBRL) aims to learn a global model for the dynamics of the environment.

Model-based Reinforcement Learning

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

no code implementations ICLR 2020 Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

We show that dynamical distances can be used in a semi-supervised regime: unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision determines the task goal, without any manually engineered reward function or goal examples.

reinforcement-learning Reinforcement Learning (RL)

Meta-Learning with Implicit Gradients

6 code implementations NeurIPS 2019 Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine

By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.

Few-Shot Image Classification Few-Shot Learning
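The implicit-differentiation idea above can be made concrete in a setting where everything is computable in closed form. The sketch below assumes a quadratic inner loss with proximal regularization, so the inner solution and its Hessian are exact; the meta-gradient then comes from solving one linear system, with no dependence on the path the inner optimizer took. Function names are illustrative, not from the paper's code.

```python
import numpy as np

# Hedged sketch of the implicit MAML (iMAML) meta-gradient on a quadratic
# inner problem:
#   min_phi 0.5*phi^T A phi - b^T phi + (lam/2)*||phi - theta||^2
# whose solution is phi* = (A + lam*I)^{-1} (b + lam*theta).
# Implicit differentiation gives d(phi*)/d(theta) = lam*(A + lam*I)^{-1},
# so the meta-gradient is lam*(A + lam*I)^{-1} grad_test(phi*):
# one linear solve, no unrolled inner loop.

def inner_solution(theta, A, b, lam):
    n = len(theta)
    return np.linalg.solve(A + lam * np.eye(n), b + lam * theta)

def implicit_meta_gradient(theta, A, b, lam, grad_test):
    n = len(theta)
    phi_star = inner_solution(theta, A, b, lam)
    meta_grad = lam * np.linalg.solve(A + lam * np.eye(n), grad_test(phi_star))
    return phi_star, meta_grad
```

In the general (non-quadratic) case the same structure holds with `A` replaced by the inner-loss Hessian at the solution, which is what makes the meta-gradient independent of the inner optimization path.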

Scaled Autonomy: Enabling Human Operators to Control Robot Fleets

no code implementations22 Sep 2019 Gokul Swamy, Siddharth Reddy, Sergey Levine, Anca D. Dragan

We learn a model of the user's preferences from observations of the user's choices in easy settings with a few robots. In challenging settings with more robots, we use this model to automatically identify which robot the user would most likely choose to control if they were able to evaluate the states of all robots at all times.

Robot Navigation

Recurrent Independent Mechanisms

3 code implementations ICLR 2021 Anirudh Goyal, Alex Lamb, Jordan Hoffmann, Shagun Sodhani, Sergey Levine, Yoshua Bengio, Bernhard Schölkopf

Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes.

Consistent Meta-Reinforcement Learning via Model Identification and Experience Relabeling

no code implementations25 Sep 2019 Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine

Reinforcement learning algorithms can acquire policies for complex tasks automatically; however, the number of samples required to learn a diverse set of skills can be prohibitively large.

Meta Reinforcement Learning reinforcement-learning +1
