Search Results for author: Justin Fu

Found 18 papers, 9 papers with code

CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning

no code implementations • 18 Apr 2022 • Siddharth Verma, Justin Fu, Mengjiao Yang, Sergey Levine

Conventionally, generation of natural language for dialogue agents may be viewed as a statistical learning problem: determine the patterns in human-provided data and generate appropriate responses with similar statistical properties.

Chatbot Offline RL +1

Benchmarks for Deep Off-Policy Evaluation

3 code implementations • ICLR 2021 • Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making.

Continuous Control Decision Making +1
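
The OPE setting in the abstract above can be illustrated with the simplest estimator in that family, importance sampling on a toy bandit (a minimal sketch with made-up policies and payoffs, not code from the benchmark):

```python
import random

random.seed(0)

# Toy two-armed bandit: arm 0 pays off 20% of the time, arm 1 pays 80%.
PAYOFF = (0.2, 0.8)

def reward(action):
    return 1.0 if random.random() < PAYOFF[action] else 0.0

behavior = (0.5, 0.5)   # pi_b: the policy that collected the data
target = (0.1, 0.9)     # pi_e: the policy we want to evaluate offline

# Offline dataset gathered by the behavior policy only.
dataset = []
for _ in range(20000):
    a = 0 if random.random() < behavior[0] else 1
    dataset.append((a, reward(a)))

# Importance-sampling OPE: reweight each logged reward by pi_e(a)/pi_b(a),
# so data from pi_b estimates the value of pi_e without running pi_e.
est = sum(target[a] / behavior[a] * r for a, r in dataset) / len(dataset)
true_value = 0.1 * 0.2 + 0.9 * 0.8  # = 0.74

print(round(est, 2), round(true_value, 2))
```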

Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation

no code implementations • ICLR 2021 • Justin Fu, Sergey Levine

We propose to tackle this problem by leveraging the normalized maximum-likelihood (NML) estimator, which provides a principled approach to handling uncertainty and out-of-distribution inputs.
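
The NML idea in the abstract can be sketched in the simplest possible case, predicting a Bernoulli outcome: refit the maximum-likelihood model once per candidate label, score each label under its own refit, then normalize (a toy illustration, not the estimator from the paper):

```python
# Plain maximum-likelihood estimate of the head probability.
def mle_prob(heads, total):
    return heads / total if total else 0.5

def nml_predict(heads, n):
    # Score each candidate outcome under the model refit to data + outcome,
    # then normalize so the scores form a distribution.
    score_h = mle_prob(heads + 1, n + 1)    # p under theta_hat(D + heads)
    score_t = 1.0 - mle_prob(heads, n + 1)  # p under theta_hat(D + tails)
    z = score_h + score_t
    return score_h / z, score_t / z

# After 3 heads in 3 flips, the plain MLE says "heads with probability 1.0";
# NML hedges against the small sample instead.
print(nml_predict(3, 3))  # -> (0.8, 0.2)
```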

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

3 code implementations • 4 May 2020 • Sergey Levine, Aviral Kumar, George Tucker, Justin Fu

In this tutorial article, we aim to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcement learning algorithms that utilize previously collected data, without additional online data collection.

Decision Making reinforcement-learning
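
The defining constraint of offline RL, learning only from previously collected data, can be shown with tabular fitted Q iteration over a fixed dataset (an illustrative sketch; the chain MDP, behavior policy, and hyperparameters are all made up):

```python
import random

random.seed(0)

# Four-state chain: action 1 moves right, action 0 stays; any transition
# that lands in the last state pays reward 1.
N_STATES, GAMMA, LR = 4, 0.9, 0.1

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else s
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

# The offline regime: a dataset of (s, a, r, s') tuples collected earlier
# by a random behavior policy.  No new environment interaction below.
dataset = []
for _ in range(2000):
    s, a = random.randrange(N_STATES), random.randrange(2)
    s2, r = step(s, a)
    dataset.append((s, a, r, s2))

# Tabular fitted Q iteration over the fixed dataset.
Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(100):
    for s, a, r, s2 in dataset:
        target = r + GAMMA * max(Q[s2])
        Q[s][a] += LR * (target - Q[s][a])

# Greedy policy recovered purely from logged data: move right.
policy = [0 if Q[s][0] >= Q[s][1] else 1 for s in range(N_STATES)]
print(policy)
```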

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

4 code implementations • 15 Apr 2020 • Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine

In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.

Offline RL reinforcement-learning

Learning to Reach Goals via Iterated Supervised Learning

2 code implementations • ICLR 2021 • Dibya Ghosh, Abhishek Gupta, Ashwin Reddy, Justin Fu, Coline Devin, Benjamin Eysenbach, Sergey Levine

Current reinforcement learning (RL) algorithms can be brittle and difficult to use, especially when learning goal-reaching behaviors from sparse rewards.

Multi-Goal Reinforcement Learning

Learning to Reach Goals Without Reinforcement Learning

no code implementations • 25 Sep 2019 • Dibya Ghosh, Abhishek Gupta, Justin Fu, Ashwin Reddy, Coline Devin, Benjamin Eysenbach, Sergey Levine

By maximizing the likelihood of good actions provided by an expert demonstrator, supervised imitation learning can produce effective policies without the algorithmic complexities and optimization challenges of reinforcement learning, at the cost of requiring an expert demonstrator -- typically a person -- to provide the demonstrations.

Imitation Learning reinforcement-learning
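
The relabeling trick behind both goal-reaching papers above, treating each trajectory's achieved final state as the goal it demonstrated, can be sketched in a few lines (hypothetical toy environment, not the authors' code):

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# 1-D corridor with states 0..9; actions move left (-1) or right (+1).
def rollout(policy, horizon=5):
    s = random.randrange(10)
    traj = []
    for _ in range(horizon):
        a = policy(s)
        traj.append((s, a))
        s = max(0, min(9, s + a))
    return traj, s          # the final state is the achieved "goal"

random_policy = lambda s: random.choice((-1, 1))

# Hindsight relabeling: every (state, action) pair in a trajectory becomes
# a supervised example of "the action for `state` when the goal is `final`".
examples = []
for _ in range(500):
    traj, final = rollout(random_policy)
    for s, a in traj:
        examples.append(((s, final), a))

# Stand-in for maximum-likelihood training: majority vote per (state, goal);
# a real implementation would fit a goal-conditioned network instead.
votes = defaultdict(Counter)
for key, a in examples:
    votes[key][a] += 1
cloned = {key: c.most_common(1)[0][0] for key, c in votes.items()}

# Even random trajectories, once relabeled, teach goal-directed behavior:
# when the goal lies to the right, the relabeled action is usually +1.
to_right = [a for (s, g), a in examples if g > s]
print(round(to_right.count(1) / len(to_right), 2))
```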

When to Trust Your Model: Model-Based Policy Optimization

10 code implementations • NeurIPS 2019 • Michael Janner, Justin Fu, Marvin Zhang, Sergey Levine

Designing effective model-based reinforcement learning algorithms is difficult because the ease of data generation must be weighed against the bias of model-generated data.

Model-based Reinforcement Learning reinforcement-learning
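
The tradeoff named in the abstract, cheap model-generated data versus the bias of that data, shows up even in a one-dimensional linear system (illustrative numbers, not MBPO itself):

```python
TRUE_A = 0.9     # true scalar dynamics: x' = 0.9 * x
MODEL_A = 0.95   # learned model with a small bias

def rollout_error(horizon, x0=1.0):
    # Roll the learned model and the true system forward from the same
    # real start state; return the gap at the final step.
    x_true = x_model = x0
    for _ in range(horizon):
        x_true *= TRUE_A
        x_model *= MODEL_A
    return abs(x_model - x_true)

# The per-step model error is tiny, but it compounds with rollout length,
# which is why short rollouts branched from real states are attractive.
for h in (1, 5, 20):
    print(h, round(rollout_error(h), 3))  # 1 0.05 / 5 0.183 / 20 0.237
```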

Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction

2 code implementations • NeurIPS 2019 • Aviral Kumar, Justin Fu, George Tucker, Sergey Levine

Bootstrapping error is due to bootstrapping from actions that lie outside of the training data distribution, and it accumulates via the Bellman backup operator.

Continuous Control Q-Learning
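
The bootstrapping-error mechanism described above can be reproduced in a two-state tabular example: an action absent from the dataset keeps its arbitrary initial Q-value, and the max in the Bellman backup propagates that value into every state that bootstraps from it (a toy sketch, not the paper's BEAR algorithm):

```python
GAMMA = 0.9

# Two states, two actions.  The pair (1, 1) never appears in the data,
# so its Q-value keeps its (badly optimistic) initialization.
Q = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 100.0}

# Logged transitions (s, a, r, s'): action 1 in state 1 is never taken.
dataset = [(0, 0, 0.0, 1), (0, 1, 0.0, 1), (1, 0, 1.0, 1)]

for _ in range(50):
    for s, a, r, s2 in dataset:
        # The backup maxes over ALL actions, including the unseen one.
        Q[(s, a)] = r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)])

# The phantom value of the never-seen pair (1, 1) has leaked into every
# state that bootstraps from state 1.
print({k: round(v, 1) for k, v in Q.items()})
# -> {(0, 0): 90.0, (0, 1): 90.0, (1, 0): 91.0, (1, 1): 100.0}
```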

Diagnosing Bottlenecks in Deep Q-learning Algorithms

1 code implementation • 26 Feb 2019 • Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL).

Continuous Control Q-Learning +1

From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following

no code implementations • ICLR 2019 • Justin Fu, Anoop Korattikara, Sergey Levine, Sergio Guadarrama

In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments.

Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition

no code implementations • NeurIPS 2018 • Justin Fu, Avi Singh, Dibya Ghosh, Larry Yang, Sergey Levine

We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available.

Continuous Control reinforcement-learning

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

no code implementations • ICLR 2018 • Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Decision Making Imitation Learning +1

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

6 code implementations • 30 Oct 2017 • Justin Fu, Katie Luo, Sergey Levine

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering.

Decision Making reinforcement-learning

Generalizing Skills with Semi-Supervised Reinforcement Learning

no code implementations • 1 Dec 2016 • Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine

We evaluate our method on challenging tasks that require control directly from images, and show that our approach can improve the generalization of a learned deep neural network policy by using experience for which no reward function is available.

One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors

no code implementations • 23 Sep 2015 • Justin Fu, Sergey Levine, Pieter Abbeel

One of the key challenges in applying reinforcement learning to complex robotic control tasks is the need to gather large amounts of experience in order to find an effective policy for the task at hand.

Model-based Reinforcement Learning One-Shot Learning +1
