Search Results for author: Yihang Yao

Found 4 papers, 3 papers with code

Learning from Sparse Offline Datasets via Conservative Density Estimation

1 code implementation • 16 Jan 2024 • Zhepeng Cen, Zuxin Liu, Zitong Wang, Yihang Yao, Henry Lam, Ding Zhao

Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment.

D4RL Density Estimation +2

Gradient Shaping for Multi-Constraint Safe Reinforcement Learning

no code implementations • 23 Dec 2023 • Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao

Leveraging insights from this framework and recognizing the significance of redundant and conflicting constraint conditions, we introduce the Gradient Shaping (GradS) method for general Lagrangian-based safe RL algorithms to improve the training efficiency in terms of both reward and constraint satisfaction.

reinforcement-learning Reinforcement Learning (RL) +1

Datasets and Benchmarks for Offline Safe Reinforcement Learning

3 code implementations • 15 Jun 2023 • Zuxin Liu, Zijian Guo, Haohong Lin, Yihang Yao, Jiacheng Zhu, Zhepeng Cen, Hanjiang Hu, Wenhao Yu, Tingnan Zhang, Jie Tan, Ding Zhao

This paper presents a comprehensive benchmarking suite tailored to offline safe reinforcement learning (RL) challenges, aiming to foster progress in the development and evaluation of safe learning algorithms in both the training and deployment phases.

Autonomous Driving Benchmarking +4

Constrained Decision Transformer for Offline Safe Reinforcement Learning

1 code implementation • 14 Feb 2023 • Zuxin Liu, Zijian Guo, Yihang Yao, Zhepeng Cen, Wenhao Yu, Tingnan Zhang, Ding Zhao

Safe reinforcement learning (RL) trains a constraint satisfaction policy by interacting with the environment.

reinforcement-learning Reinforcement Learning (RL) +1
