Offline RL
225 papers with code • 2 benchmarks • 6 datasets
Libraries
Use these libraries to find Offline RL models and implementations.
Latest papers with no code
Offline Trajectory Generalization for Offline Reinforcement Learning
We then propose four strategies that use World Transformers to generate high-reward simulated trajectories by perturbing the offline data.
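The augmentation idea in this snippet can be sketched in a few lines. This is not the paper's World Transformer pipeline; the Gaussian state noise and the return-based filtering rule below are our own illustrative assumptions.

```python
# Sketch: augment an offline RL dataset with perturbed trajectories,
# keeping only high-return ("high-reward") simulated copies.
import random

def perturb_trajectory(traj, sigma=0.1, rng=random.Random(0)):
    # traj: list of (state, action, reward); jitter states, keep actions.
    return [(s + rng.gauss(0.0, sigma), a, r) for s, a, r in traj]

def augment(dataset, min_return=1.0):
    # Generate one perturbed copy per trajectory; retain a copy only if
    # its return clears the threshold, mirroring the selection idea above.
    out = list(dataset)
    for traj in dataset:
        sim = perturb_trajectory(traj)
        if sum(r for _, _, r in sim) >= min_return:
            out.append(sim)
    return out

data = [[(0.0, 0, 0.5), (0.1, 1, 0.7)],  # return 1.2 -> copy kept
        [(0.5, 0, 0.1)]]                 # return 0.1 -> copy discarded
aug = augment(data)
```

Only the first trajectory's simulated copy survives the filter, so the augmented dataset grows from two trajectories to three.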
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
We evaluate our tracker on several high-fidelity environments with challenging situations, such as distraction and occlusion.
Generative Probabilistic Planning for Optimizing Supply Chain Networks
Supply chain networks in enterprises are typically composed of complex topological graphs involving various types of nodes and edges, accommodating numerous products with considerable demand and supply variability.
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains
In this paper, we investigate an offline reinforcement learning (RL) problem where datasets are collected from two domains.
Scaling Vision-and-Language Navigation With Offline RL
The study of vision-and-language navigation (VLN) has typically relied on expert trajectories, which may not always be available in real-world situations due to the significant effort required to collect them.
Uncertainty-aware Distributional Offline Reinforcement Learning
Offline reinforcement learning (RL) presents distinct challenges as it relies solely on observational data.
Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling
The LE is learned from a subset of user-item interaction data, reducing the need for large training data, and can synthesise user feedback for offline data by (i) acting as a state model that produces high-quality states which enrich the user representation, and (ii) functioning as a reward model that accurately captures nuanced user preferences over actions.
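The two roles of the LE described above can be illustrated with a toy stub. This is a hedged sketch, not the paper's implementation: the 1-D item embeddings, the mean-pooled state, and the similarity-based reward are all assumptions made for brevity.

```python
# Sketch: a learned-environment (LE) stand-in that (i) builds a state
# from interaction history and (ii) scores candidate actions as rewards.
from dataclasses import dataclass, field

@dataclass
class ToyLE:
    item_embed: dict = field(default_factory=dict)  # item id -> 1-D embedding

    def state(self, history):
        # (i) state model: summarize the interaction history into a state.
        vals = [self.item_embed.get(i, 0.0) for i in history]
        return sum(vals) / len(vals) if vals else 0.0

    def reward(self, state, action):
        # (ii) reward model: negative distance as a similarity surrogate.
        return -abs(state - self.item_embed.get(action, 0.0))

le = ToyLE(item_embed={"a": 0.2, "b": 0.8, "c": 0.5})
s = le.state(["a", "b"])  # enriched state from past interactions
best = max(["a", "b", "c"], key=lambda i: le.reward(s, i))
```

Here the history of "a" and "b" pools to a state of 0.5, so the reward model ranks "c" highest; a real LE would learn both mappings from interaction data.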
The Value of Reward Lookahead in Reinforcement Learning
In particular, we measure the ratio between the value of standard RL agents and that of agents with partial future-reward lookahead.
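A value ratio of this kind can be computed exactly in a toy setting. The single-step, two-armed Bernoulli setup below is our own illustration, not the paper's analysis: the standard agent knows only the reward means, while the lookahead agent observes the realized rewards before acting.

```python
# Toy value ratio: standard agent vs. agent with full reward lookahead
# over two independent Bernoulli(p) arms (setup assumed for illustration).
from itertools import product

p = 0.5  # per-arm success probability (assumption)

# Standard agent: commits to one arm using only the means -> value = p.
v_standard = p

# Lookahead agent: sees both realized rewards and takes the max,
# so its value is E[max(R1, R2)] = P(at least one success).
v_lookahead = sum(
    max(r1, r2) * (p if r1 else 1 - p) * (p if r2 else 1 - p)
    for r1, r2 in product([0, 1], repeat=2)
)

ratio = v_standard / v_lookahead  # the kind of ratio the paper measures
```

With p = 0.5 the lookahead value is 0.75 against 0.5 for the standard agent, a ratio of 2/3; the paper studies such ratios in general, with partial rather than full lookahead.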
Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning
Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.
Towards Optimizing Human-Centric Objectives in AI-Assisted Decision-Making With Offline Reinforcement Learning
Across two experiments (N=316 and N=964), our results demonstrated that people interacting with policies optimized for accuracy achieved significantly better accuracy -- and even human-AI complementarity -- compared to those interacting with any other type of AI support.