Offline RL

46 papers with code • 0 benchmarks • 6 datasets

Offline RL (also known as batch RL) studies policy optimization from fixed, pre-recorded datasets of interactions, without further online access to the environment.

Greatest papers with code

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

google-research/google-research 14 Mar 2021

Many modern approaches to offline Reinforcement Learning (RL) use behavior regularization, typically augmenting a model-free actor-critic algorithm with a penalty that measures the divergence of the policy from the offline data.

Offline RL
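
As a rough illustration of the behavior-regularization recipe described above (not the paper's Fisher-divergence method), a generic actor loss maximizes the critic value while penalizing divergence from the dataset actions. The `policy`, `critic`, and `alpha` below are hypothetical placeholders:

```python
# Generic behavior-regularized actor loss: maximize Q(s, pi(s)) while
# penalizing divergence from the logged actions (here a simple negative
# log-likelihood surrogate, not the paper's Fisher divergence).
import torch

def behavior_regularized_actor_loss(policy, critic, states, dataset_actions, alpha=1.0):
    dist = policy(states)               # assumed: returns a torch.distributions object
    actions = dist.rsample()            # reparameterized sample for the Q term
    q_values = critic(states, actions)  # assumed critic: Q(s, a)
    # Behavior penalty: how unlikely the logged actions are under the current policy.
    # Assumes log_prob returns one value per batch element.
    bc_penalty = -dist.log_prob(dataset_actions).mean()
    # Gradient ascent on E[Q] - alpha * penalty == descent on the loss below.
    return -q_values.mean() + alpha * bc_penalty
```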

Behavior Regularized Offline Reinforcement Learning

google-research/google-research 26 Nov 2019

In reinforcement learning (RL) research, it is common to assume access to direct online interactions with the environment.

Continuous Control Offline RL

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

deepmind/deepmind-research NeurIPS 2020

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.

Atari Games DQN Replay Dataset +1

Critic Regularized Regression

facebookresearch/ReAgent NeurIPS 2020

Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction.

Offline RL

Acme: A Research Framework for Distributed Reinforcement Learning

deepmind/acme 1 Jun 2020

Ultimately, we show that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.

DQN Replay Dataset

Offline Reinforcement Learning with Implicit Q-Learning

rail-berkeley/rlkit 12 Oct 2021

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

Fine-tuning Offline RL +1
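
The expectile idea above amounts to a small asymmetric regression loss. This is a minimal sketch with assumed tensor shapes, not the rail-berkeley/rlkit implementation:

```python
# Expectile regression loss for fitting V(s) to an upper expectile (tau > 0.5)
# of target Q-values, so V tracks the value of good in-dataset actions without
# querying the critic at unseen actions.
import torch

def expectile_loss(v_pred, q_target, tau=0.7):
    diff = q_target - v_pred
    # Asymmetric squared error: weight tau when Q > V, (1 - tau) when Q < V.
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff.pow(2)).mean()
```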

Decision Transformer: Reinforcement Learning via Sequence Modeling

kzl/decision-transformer NeurIPS 2021

In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.

Atari Games Language Modelling +2
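
To make the sequence-modeling view concrete, each timestep is typically tokenized as a (return-to-go, state, action) triple and the model is trained to predict the next action. The snippet below shows only the return-to-go computation and is an illustrative sketch, not the kzl/decision-transformer code:

```python
import numpy as np

def returns_to_go(rewards):
    # Suffix sums: R_t = r_t + r_{t+1} + ... + r_T, the target return the
    # model is conditioned on at timestep t.
    return np.cumsum(rewards[::-1])[::-1]

print(returns_to_go(np.array([1.0, 0.0, 2.0, 1.0])))  # [4. 3. 3. 1.]
```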

D4RL: Datasets for Deep Data-Driven Reinforcement Learning

rail-berkeley/offline_rl 15 Apr 2020

In this work, we introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.

Offline RL
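
A typical way to load one of these datasets, assuming the d4rl package and its MuJoCo dependencies are installed (the environment name below is illustrative):

```python
import gym
import d4rl  # importing registers the offline environments with gym

env = gym.make('halfcheetah-medium-v2')
dataset = env.get_dataset()  # dict of numpy arrays: 'observations', 'actions', 'rewards', ...
print(dataset['observations'].shape, dataset['actions'].shape)
# d4rl.qlearning_dataset(env) returns transitions with 'next_observations' as well.
```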

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

maximecb/gym-miniworld ICLR 2021

We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.

Offline RL