Offline RL

225 papers with code • 2 benchmarks • 6 datasets

Offline reinforcement learning (also called batch RL) learns a policy from a fixed, previously collected dataset of transitions, without any further interaction with the environment. Because the agent cannot explore, it must cope with distribution shift between the behavior policy that generated the data and the policy being learned.
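
A minimal sketch of the setting, assuming a tabular MDP and a synthetic pre-collected batch (all sizes and hyperparameters are illustrative): fitted-Q-iteration-style updates over a fixed dataset, with no environment interaction.

```python
import numpy as np

# Hypothetical pre-collected batch of (s, a, r, s') transitions from an
# unknown behavior policy; the agent never queries the environment.
rng = np.random.default_rng(0)
n_states, n_actions = 5, 2
dataset = [(rng.integers(n_states), rng.integers(n_actions),
            rng.random(), rng.integers(n_states))
           for _ in range(1000)]

gamma, lr = 0.9, 0.1
Q = np.zeros((n_states, n_actions))

# Q-learning updates over the static batch only.
for _ in range(50):                          # passes over the dataset
    for s, a, r, s_next in dataset:
        target = r + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])

policy = Q.argmax(axis=1)                    # greedy policy from batch data
```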

Libraries

Use these libraries to find Offline RL models and implementations
See all 10 libraries.

Latest papers with no code

Offline Trajectory Generalization for Offline Reinforcement Learning

no code yet • 16 Apr 2024

We then propose four strategies that use World Transformers to generate high-reward simulated trajectories by perturbing the offline data.
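
The paper's World Transformers are not public; as a hedged sketch of the general recipe the snippet describes, the toy code below perturbs offline start states, rolls out hypothetical learned dynamics and reward models (`dynamics_model`, `reward_model`, and `perturb_and_rollout` are all invented placeholders, not the paper's method), and keeps only high-return simulated trajectories.

```python
import numpy as np

# Stand-ins for learned world models trained on the offline dataset.
def dynamics_model(state, action):        # assumed learned dynamics
    return state + 0.1 * action

def reward_model(state, action):          # assumed learned reward
    return -np.abs(state).sum()

def perturb_and_rollout(start_state, horizon, noise=0.05, rng=None):
    """One augmentation strategy: perturb an offline start state,
    then roll out the learned models to simulate a trajectory."""
    if rng is None:
        rng = np.random.default_rng()
    state = start_state + rng.normal(0, noise, size=start_state.shape)
    traj, ret = [], 0.0
    for _ in range(horizon):
        action = rng.uniform(-1, 1, size=state.shape)  # placeholder policy
        reward = reward_model(state, action)
        traj.append((state, action, reward))
        ret += reward
        state = dynamics_model(state, action)
    return traj, ret

# Keep only simulated trajectories whose return clears a threshold,
# mirroring the "high-reward trajectory" filtering described above.
starts = [np.zeros(3) for _ in range(100)]
augmented = [t for t, ret in (perturb_and_rollout(s, 10) for s in starts)
             if ret > -20.0]
```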

Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL

no code yet • 15 Apr 2024

We evaluate our tracker on several high-fidelity environments with challenging situations, such as distraction and occlusion.

Generative Probabilistic Planning for Optimizing Supply Chain Networks

no code yet • 11 Apr 2024

Supply chain networks in enterprises are typically composed of complex topological graphs involving various types of nodes and edges, accommodating numerous products with considerable demand and supply variability.

Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains

no code yet • 11 Apr 2024

In this paper, we investigate an offline reinforcement learning (RL) problem where datasets are collected from two domains.

Scaling Vision-and-Language Navigation With Offline RL

no code yet • 27 Mar 2024

The study of vision-and-language navigation (VLN) has typically relied on expert trajectories, which may not always be available in real-world situations due to the significant effort required to collect them.

Uncertainty-aware Distributional Offline Reinforcement Learning

no code yet • 26 Mar 2024

Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data.

Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling

no code yet • 25 Mar 2024

The LE is learned from a subset of user-item interaction data, reducing the need for large training datasets, and can synthesise user feedback for offline data by (i) acting as a state model that produces high-quality states enriching the user representation, and (ii) functioning as a reward model that accurately captures nuanced user preferences over actions.
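
As a rough illustration of the two roles described above, the sketch below defines a hypothetical `LearnedEnvironment` interface; the class name, the placeholder embedding, and the trivial scoring logic are assumptions for illustration, not the paper's model.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LearnedEnvironment:
    """Hypothetical LE-style component: one model that both enriches
    the user state and synthesizes reward feedback offline."""
    history: List[int] = field(default_factory=list)  # item-id interactions

    def state(self) -> List[float]:
        # State model: map the raw interaction history to an enriched
        # user representation (a trivial placeholder embedding here).
        n = max(len(self.history), 1)
        return [float(len(self.history)), float(sum(self.history)) / n]

    def reward(self, action: int) -> float:
        # Reward model: score a candidate action without a live user,
        # standing in for the learned preference scorer.
        return 1.0 if action in self.history else 0.0

env = LearnedEnvironment(history=[3, 7, 7, 42])
s = env.state()            # enriched state for the RL recommender
r = env.reward(action=7)   # synthetic feedback for offline training
```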

The Value of Reward Lookahead in Reinforcement Learning

no code yet • 18 Mar 2024

In particular, we measure the ratio between the value of standard RL agents and that of agents with partial future-reward lookahead.
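
The value ratio can be made concrete in a toy bandit: a standard agent commits to the arm with the best mean, while a reward-lookahead agent sees the realized rewards before choosing. The Bernoulli means below are illustrative, not from the paper.

```python
import numpy as np

# Toy two-armed Bernoulli bandit; means are illustrative.
rng = np.random.default_rng(0)
means = np.array([0.5, 0.6])

n = 100_000
rewards = rng.random((n, 2)) < means       # realized rewards per round

# Standard agent: commits to the best arm in expectation.
v_standard = rewards[:, means.argmax()].mean()

# Lookahead agent: observes realized rewards before choosing.
v_lookahead = rewards.max(axis=1).mean()

print(v_standard / v_lookahead)            # ratio of the two values
```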

Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning

no code yet • 14 Mar 2024

Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.
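
One common formalization is a robust Bellman backup that minimizes over transition models in an uncertainty set around the nominal dynamics. The tabular sketch below uses a single conservative member of a total-variation ball (shifting `rho` probability mass onto the lowest-value state), a simplification of the full inner minimization; all numbers are illustrative and this is not the paper's algorithm.

```python
import numpy as np

def robust_backup(P, R, V, gamma=0.9, rho=0.1):
    """P: (S, A, S) nominal transitions, R: (S, A) rewards, V: (S,) values.
    Pessimistic backup under one TV-ball perturbation of radius rho."""
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    worst = V.min()                        # adversary pushes mass here
    for s in range(S):
        for a in range(A):
            exp_v = P[s, a] @ V            # nominal expected value
            # Mix with a point mass on the worst state: this perturbed
            # model lies within TV distance rho of the nominal dynamics.
            Q[s, a] = R[s, a] + gamma * ((1 - rho) * exp_v + rho * worst)
    return Q

rng = np.random.default_rng(0)
S, A = 4, 2
P = rng.dirichlet(np.ones(S), size=(S, A))  # nominal dynamics
R = rng.random((S, A))
V = np.zeros(S)
for _ in range(100):                         # robust value iteration
    V = robust_backup(P, R, V).max(axis=1)
```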

Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning

no code yet • 9 Mar 2024

Across two experiments (N=316 and N=964), our results demonstrate that people interacting with policies optimized for accuracy achieve significantly better accuracy, and even human-AI complementarity, compared with those interacting with any other type of AI support.