D4RL
69 papers with code • 1 benchmark • 1 dataset
Libraries
Use these libraries to find D4RL models and implementations.
Most implemented papers
Mildly Conservative Q-Learning for Offline Reinforcement Learning
The distribution shift between the learned policy and the behavior policy makes it necessary for the value function to stay conservative, so that out-of-distribution (OOD) actions are not severely overestimated.
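The conservatism idea can be illustrated with a toy penalty term (a generic sketch in the spirit of conservative Q-learning methods, not this paper's exact objective; the linear `q` function and the sampling scheme are stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

def q(s, a, w):
    # toy linear Q-function, purely for illustration
    return s * w[0] + a * w[1]

s = rng.normal(size=32)                      # states from the dataset
a_data = rng.normal(size=32)                 # behavior-policy actions
a_cand = rng.uniform(-2, 2, size=(10, 32))   # sampled candidate (possibly OOD) actions

w = np.array([0.5, 1.0])

# conservative penalty: push Q down on the most overestimated sampled
# actions and up on dataset actions, so that OOD values stay pessimistic
penalty = np.mean(q(s[None, :], a_cand, w).max(axis=0) - q(s, a_data, w))
```

Adding such a penalty (scaled by a trade-off weight) to the usual TD loss biases the critic toward lower values outside the data's support.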
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
In our approach, we learn an action-value function and we add a term maximizing action-values into the training loss of the conditional diffusion model, which results in a loss that seeks optimal actions that are near the behavior policy.
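The combined loss described above can be sketched as a denoising (behavior-cloning) term minus a Q-maximization term (a minimal sketch; the `q_value` critic, shapes, and the value of `alpha` are illustrative stand-ins, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def diffusion_bc_loss(noise_pred, noise):
    # denoising term: the diffusion model is trained to predict injected noise,
    # which behavior-clones the dataset's action distribution
    return np.mean((noise_pred - noise) ** 2)

def q_value(s, a):
    # stand-in critic; in practice this is a learned Q-network
    return -np.mean((a - 0.3) ** 2, axis=-1)

batch = 16
noise = rng.normal(size=(batch, 4))
noise_pred = noise + 0.1 * rng.normal(size=(batch, 4))   # imperfect denoiser
s = rng.normal(size=(batch, 8))
a_sampled = rng.normal(size=(batch, 4))  # actions sampled from the diffusion policy

alpha = 2.5  # trade-off weight between behavior cloning and value maximization
loss = diffusion_bc_loss(noise_pred, noise) - alpha * np.mean(q_value(s, a_sampled))
```

Minimizing `loss` pulls sampled actions toward high Q-values while the denoising term keeps them near the behavior distribution.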
Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief
To make the approach practical, we further devise an offline RL algorithm that approximately finds the solution.
CORL: Research-oriented Deep Offline Reinforcement Learning Library
CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms.
Extreme Q-Learning: MaxEnt RL without Entropy
Using EVT, we derive our Extreme Q-Learning framework and, consequently, online and (for the first time) offline MaxEnt Q-learning algorithms that do not explicitly require access to a policy or its entropy.
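The key tool here is Gumbel (LINEX-style) regression, whose minimizer is a log-sum-exp soft maximum, so a soft value can be fit without ever evaluating a policy's entropy. A minimal numeric sketch (the grid search is for illustration only):

```python
import numpy as np

def gumbel_loss(v, q, beta):
    # LINEX-style loss; its minimizer is v = beta * log E[exp(q / beta)],
    # i.e. a soft maximum of the Q-values, computed without a policy
    z = (q - v) / beta
    return np.mean(np.exp(z) - z - 1.0)

q = np.array([0.0, 1.0, 2.0])
beta = 1.0

# brute-force minimization over a grid, to check the closed-form minimizer
vs = np.linspace(-1.0, 4.0, 501)
v_star = vs[int(np.argmin([gumbel_loss(v, q, beta) for v in vs]))]
soft_max = beta * np.log(np.mean(np.exp(q / beta)))
```

Setting the gradient of the loss to zero gives `mean(exp((q - v)/beta)) = 1`, which rearranges to the log-sum-exp form above.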
Anti-Exploration by Random Network Distillation
Despite the success of Random Network Distillation (RND) in various domains, it was shown as not discriminative enough to be used as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning.
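For context, RND scores novelty as the error of a predictor trained to match a frozen random network; in the anti-exploration setting that error is used as a penalty on OOD actions. A toy sketch (the linear predictor, `tanh` target, and least-squares "training" are simplifications, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# frozen random "target network" (here: a fixed nonlinear feature map)
W_target = rng.normal(size=(8, 4))
def target(x):
    return np.tanh(x @ W_target)

# train a linear predictor to match the target on in-dataset inputs only
X_data = rng.normal(size=(256, 8))
W_pred, *_ = np.linalg.lstsq(X_data, target(X_data), rcond=None)

def rnd_bonus(x):
    # prediction error of the frozen target: low near the data,
    # high on novel inputs -> usable as an anti-exploration penalty
    return np.sum((x @ W_pred - target(x)) ** 2, axis=-1)

x_in = X_data[:16]                        # in-distribution inputs
x_out = 10.0 * rng.normal(size=(16, 8))   # far outside the dataset
```

Here `rnd_bonus(x_out)` is much larger than `rnd_bonus(x_in)`; the paper's point is that with naive conditioning this gap is not sharp enough to separate OOD *actions*, which motivates their modifications.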
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
This gives a deeper understanding of why the in-sample learning paradigm works: it applies implicit value regularization to the policy.
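One concrete in-sample objective in this family is expectile regression (used by IQL, which the implicit-value-regularization view generalizes): the value function is fit only on dataset actions, with an asymmetric loss that biases it toward an upper expectile of in-sample Q-values. A minimal sketch (the grid search is illustrative):

```python
import numpy as np

def expectile_loss(u, tau):
    # asymmetric squared loss; tau > 0.5 weights positive errors more,
    # pushing v toward an upper expectile of the in-sample Q-values
    w = np.where(u > 0, tau, 1.0 - tau)
    return np.mean(w * u ** 2)

# Q-values of actions that actually appear in the dataset -- no OOD queries
q = np.array([0.0, 1.0, 2.0, 3.0])

vs = np.linspace(0.0, 3.0, 301)
losses = [expectile_loss(q - v, tau=0.9) for v in vs]
v_star = vs[int(np.argmin(losses))]   # the tau=0.9 expectile, here 2.5
```

With `tau = 0.9` the fitted value (2.5) sits well above the mean (1.5) but below the maximum, approximating a max over in-sample actions without ever evaluating OOD ones.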
Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning
The main challenge for this setting is that the intermediate guidance during the diffusion sampling procedure, which is jointly defined by the sampling distribution and the energy function, is unknown and is hard to estimate.
Datasets and Benchmarks for Offline Safe Reinforcement Learning
This paper presents a comprehensive benchmarking suite tailored to offline safe reinforcement learning (RL) challenges, aiming to foster progress in the development and evaluation of safe learning algorithms in both the training and deployment phases.
d3rlpy: An Offline Deep Reinforcement Learning Library
In this paper, we introduce d3rlpy, an open-source offline deep reinforcement learning (RL) library for Python.