1 Jan 2024 • Honghao Wei, Xiyue Peng, Xin Liu, Arnob Ghosh
Theoretically, we demonstrate that when the actor employs a no-regret optimization oracle, SATAC achieves two guarantees: (i) for the first time in the offline RL setting, we establish that SATAC can produce a policy that outperforms the behavior policy while maintaining the same level of safety, a property critical to designing algorithms for offline RL.