Search Results for author: Anikait Singh

Found 6 papers, 3 papers with code

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning

1 code implementation · 9 Mar 2023 · Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine

Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.

Offline RL · Q-Learning +1
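The calibration idea in the snippet above can be sketched as a modification to a CQL-style conservative regularizer: the usual term pushes down Q-values at policy actions, and Cal-QL clips that push-down at a reference value so the Q-function stays conservative without collapsing below the behavior policy's value. This is an illustrative sketch, not the paper's implementation; the function name and inputs (`q_policy`, `q_data`, `v_ref`) are hypothetical, and `v_ref` stands in for reference values such as Monte Carlo returns of the behavior policy.

```python
import numpy as np

def cal_ql_regularizer(q_policy, q_data, v_ref):
    """Sketch of a calibrated conservative regularizer (Cal-QL-style).

    q_policy: Q-values at actions sampled from the learned policy
    q_data:   Q-values at actions taken in the offline dataset
    v_ref:    reference values for the same states, e.g. Monte Carlo
              returns of the behavior policy

    A plain CQL regularizer would be mean(q_policy) - mean(q_data):
    push policy-action Q-values down, dataset-action Q-values up.
    Clipping q_policy at v_ref stops the push-down once the Q-value
    is already below the reference, which is what keeps the learned
    values at a reasonable (calibrated) scale.
    """
    return np.mean(np.maximum(q_policy, v_ref)) - np.mean(q_data)
```

Minimizing this term alongside the standard TD loss would, under these assumptions, yield a Q-function that is conservative yet never driven below the reference values.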

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints

no code implementations · 2 Nov 2022 · Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine

Both theoretically and empirically, we show that typical offline RL methods, which are based on distribution constraints, fail to learn from data with such non-uniform variability, due to the requirement to stay close to the behavior policy to the same extent across the state space.

Atari Games · Offline RL +2

Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials

1 code implementation · 11 Oct 2022 · Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine

To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.

Offline RL · Q-Learning +1

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?

no code implementations · 12 Apr 2022 · Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine

To answer this question, we characterize the properties of environments that allow offline RL methods to perform better than BC methods, even when only provided with expert data.

Atari Games · Imitation Learning +3

Should I Run Offline Reinforcement Learning or Behavioral Cloning?

no code implementations · ICLR 2022 · Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine

In this paper, our goal is to characterize environments and dataset compositions where offline RL leads to better performance than BC.

Atari Games · Offline RL +3

A Workflow for Offline Model-Free Robotic Reinforcement Learning

1 code implementation · 22 Sep 2021 · Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine

To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.

Offline RL · reinforcement-learning +1
