no code implementations • 7 Mar 2023 • Wei Xi, Yongxin Zhang, Changnan Xiao, Xuefeng Huang, Shihong Deng, Haowei Liang, Jie Chen, Peng Sun
Deep Reinforcement Learning combined with Fictitious Play shows impressive results on many benchmark games, most of which are, however, single-stage.
no code implementations • 1 Jun 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng
We find that value-based reinforcement learning methods with an ε-greedy mechanism can enjoy three characteristics: Closed-form Diversity, Objective-invariant Exploration, and Adaptive Trade-off, which help value-based methods avoid the policy collapse problem.
no code implementations • 9 May 2021 • Changnan Xiao, Haosen Shi, Jiajun Fan, Shihong Deng, Haiyan Yin
We study the problem of model-free reinforcement learning, which is often solved following the principle of Generalized Policy Iteration (GPI).
1 code implementation • 1 Jan 2021 • Dongyang Zhao, Yue Huang, Changnan Xiao, Yue Li, Shihong Deng
To address the problem brought by the environment, we propose a Meta Soft Hierarchical reinforcement learning framework (MeSH), in which each low-level sub-policy focuses on a specific sub-task and the high-level policy automatically learns to utilize the low-level sub-policies through meta-gradients.
Hierarchical Reinforcement Learning • Meta Reinforcement Learning