no code implementations • 21 Jan 2024 • Mao Hong, Zhiyue Zhang, Yue Wu, Yanxun Xu
Model-based offline reinforcement learning (RL) methods have achieved state-of-the-art performance in many decision-making problems thanks to their sample efficiency and generalizability.
no code implementations • 26 May 2023 • Mao Hong, Zhengling Qi, Yanxun Xu
To the best of our knowledge, this is the first work studying the policy gradient method for POMDPs under the offline setting.