no code implementations • 23 Sep 2015 • Justin Fu, Sergey Levine, Pieter Abbeel
One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand.
no code implementations • 1 Dec 2016 • Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine
We evaluate our method on challenging tasks that require control directly from images, and show that our approach can improve the generalization of a learned deep neural network policy by using experience for which no reward function is available.
1 code implementation • NeurIPS 2017 • Justin Fu, John D. Co-Reyes, Sergey Levine
Deep reinforcement learning algorithms have been shown to learn complex tasks using highly general policy classes.
7 code implementations • 30 Oct 2017 • Justin Fu, Katie Luo, Sergey Levine
Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.
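This paper's method (AIRL) learns a reward via an adversarial discriminator of the form D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + π(a|s)), whose training signal log D − log(1 − D) simplifies to f − log π. A minimal numeric sketch of that form (the function names here are illustrative, not from the paper's code):

```python
import math

def airl_discriminator(f_value, policy_prob):
    """AIRL-style discriminator: D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)).

    f_value: learned reward/advantage estimate f(s, a)
    policy_prob: current policy's probability pi(a|s) of the sampled action
    """
    ef = math.exp(f_value)
    return ef / (ef + policy_prob)

def recovered_reward(f_value, policy_prob):
    """Training signal log D - log(1 - D), which simplifies to f - log pi."""
    d = airl_discriminator(f_value, policy_prob)
    return math.log(d) - math.log(1.0 - d)
```

When f equals the policy's log-probability the discriminator outputs 0.5, i.e. it can no longer tell expert from policy samples, which is the adversarial equilibrium.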
no code implementations • ICLR 2018 • Justin Fu, Katie Luo, Sergey Levine
no code implementations • NeurIPS 2018 • Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine
We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.
no code implementations • ICLR 2019 • Justin Fu, Anoop Korattikara, Sergey Levine, Sergio Guadarrama
In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments.
1 code implementation • 26 Feb 2019 • Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine
Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL).
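The Q-learning update this paper analyzes is, in its tabular form, the standard temporal-difference backup toward r + γ max_a' Q(s', a'). A minimal sketch:

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]
```

Deep RL replaces the table `Q` with a function approximator trained on the same target, which is precisely where the convergence questions studied in the paper arise.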
3 code implementations • NeurIPS 2019 • Aviral Kumar, Justin Fu, George Tucker, Sergey Levine
Bootstrapping error arises from bootstrapping off actions that lie outside the training data distribution, and it accumulates through the Bellman backup operator.
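The failure mode, and the support-constraint idea behind this paper's fix, can be shown in a toy backup: if the max in the Bellman target ranges over an action the dataset never covers, a spuriously high estimate for that action propagates into every backed-up value. A hedged sketch (toy numbers, not from the paper):

```python
def bellman_backup(q_next, reward, gamma, allowed_actions):
    """Bellman target with the max restricted to a given action set.
    Restricting it to in-support actions avoids bootstrapping from
    erroneously high out-of-distribution estimates."""
    return reward + gamma * max(q_next[a] for a in allowed_actions)

# Toy Q estimates at the next state: action 2 never appears in the data,
# so its estimate is spuriously high.
q_next = {0: 1.0, 1: 0.8, 2: 50.0}
unconstrained = bellman_backup(q_next, reward=0.0, gamma=0.9,
                               allowed_actions=[0, 1, 2])  # inherits the error
constrained = bellman_backup(q_next, reward=0.0, gamma=0.9,
                             allowed_actions=[0, 1])       # stays in support
```

The unconstrained target absorbs the bogus 50.0; constraining the backup to actions supported by the data keeps the target near the true values.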
11 code implementations • NeurIPS 2019 • Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine
Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.
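The paper's way of managing that trade-off is to branch short model rollouts from states observed in the real environment, so model-generated data is plentiful but compounding model error stays bounded. A minimal sketch, where `model` and `policy` stand in for a learned dynamics/reward model and the current policy (both assumed, not the paper's actual interfaces):

```python
def branched_rollouts(real_states, model, policy, k=3):
    """Branch from real states and roll a learned model forward only k steps,
    collecting synthetic (s, a, r, s') transitions for policy training."""
    synthetic = []
    for s in real_states:
        for _ in range(k):
            a = policy(s)
            s_next, r = model(s, a)  # learned dynamics/reward model (assumed)
            synthetic.append((s, a, r, s_next))
            s = s_next
    return synthetic
```

Keeping `k` small is the key design choice: one-step model errors cannot compound over long imagined trajectories.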
no code implementations • 25 Sep 2019 • Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine
By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning. The cost is that it requires an expert demonstrator -- typically a person -- to provide the demonstrations.
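In a discrete state/action space, maximum-likelihood imitation reduces to fitting the empirical action frequencies per state. A minimal sketch of that baseline (illustrative only; this paper's contribution is relabeling the agent's *own* experience so no external expert is needed):

```python
from collections import Counter, defaultdict

def fit_bc_policy(demos):
    """Behavioral cloning by maximum likelihood: for discrete states and
    actions this reduces to picking each state's most frequent demonstrated
    action."""
    counts = defaultdict(Counter)
    for state, action in demos:
        counts[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

demos = [("door", "open"), ("door", "open"), ("door", "push"), ("hall", "walk")]
policy = fit_bc_policy(demos)
```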
2 code implementations • ICLR 2021 • Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine
Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.
7 code implementations • 15 Apr 2020 • Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine
In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
3 code implementations • 4 May 2020 • Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.
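The defining constraint of the setting the tutorial covers is that every update draws from a fixed transition dataset, with no further environment interaction. A tabular fitted-Q-style sketch of that loop (toy code, not from the tutorial):

```python
from collections import defaultdict

def offline_q_iteration(dataset, actions, gamma=0.9, iters=50):
    """Offline RL in its simplest form: repeatedly apply Bellman backups using
    only a fixed dataset of (s, a, r, s', done) transitions -- no new data
    is ever collected."""
    Q = defaultdict(float)
    for _ in range(iters):
        for s, a, r, s_next, done in dataset:
            target = r if done else r + gamma * max(Q[(s_next, b)] for b in actions)
            Q[(s, a)] = target
    return Q
```

With function approximation, the same backup can bootstrap from actions the dataset never covers, which is the central difficulty the tutorial's surveyed algorithms address.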
no code implementations • ICLR 2021 • Justin Fu, Sergey Levine
We propose to tackle this problem by leveraging the normalized maximum-likelihood (NML) estimator, which provides a principled approach to handling uncertainty and out-of-distribution inputs.
3 code implementations • ICLR 2021 • Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine
Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.
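One classical baseline among the estimators such a benchmark compares is per-trajectory importance sampling: reweight each logged return by the likelihood ratio of the evaluation policy to the behavior policy. A minimal sketch (the function names are illustrative):

```python
def is_estimate(trajectories, pi_e, pi_b):
    """Per-trajectory importance sampling for OPE: weight each observed
    return by prod_t pi_e(a_t|s_t) / pi_b(a_t|s_t)."""
    total = 0.0
    for traj in trajectories:          # traj: list of (state, action, reward)
        weight, ret = 1.0, 0.0
        for s, a, r in traj:
            weight *= pi_e(s, a) / pi_b(s, a)
            ret += r
        total += weight * ret
    return total / len(trajectories)
```

The estimator is unbiased but its variance grows with horizon, which is one reason benchmarks with long-horizon tasks are needed to compare OPE methods meaningfully.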
no code implementations • Findings (NAACL) 2022 • Charlie Snell, Mengjiao Yang, Justin Fu, Yi Su, Sergey Levine
Goal-oriented dialogue systems face a trade-off between fluent language generation and task-specific control.
2 code implementations • NAACL 2022 • Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine
Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.
no code implementations • 18 Oct 2022 • Eli Bronstein, Mark Palatucci, Dominik Notz, Brandyn White, Alex Kuefler, Yiren Lu, Supratik Paul, Payam Nikdel, Paul Mougin, Hongge Chen, Justin Fu, Austin Abrams, Punit Shah, Evan Racah, Benjamin Frenkel, Shimon Whiteson, Dragomir Anguelov
We demonstrate the first large-scale application of model-based generative adversarial imitation learning (MGAIL) to the task of dense urban self-driving.
no code implementations • 21 Dec 2022 • Yiren Lu, Justin Fu, George Tucker, Xinlei Pan, Eli Bronstein, Rebecca Roelofs, Benjamin Sapp, Brandyn White, Aleksandra Faust, Shimon Whiteson, Dragomir Anguelov, Sergey Levine
To our knowledge, this is the first application of a combined imitation and reinforcement learning approach in autonomous driving that utilizes large amounts of real-world human driving data.