1 code implementation • 6 Dec 2023 • Junjie Sheng, Zixiao Huang, Chuyun Shen, Wenhao Li, Yun Hua, Bo Jin, Hongyuan Zha, Xiangfeng Wang
The formidable capacity for zero- or few-shot decision-making in language agents encourages us to pose a compelling question: Can language agents be alternatives to PPO agents in traditional sequential decision-making tasks?
2 code implementations • 9 Dec 2021 • Junjie Sheng, Shengliang Cai, Haochuan Cui, Wenhao Li, Yun Hua, Bo Jin, Wenli Zhou, Yiqiu Hu, Lei Zhu, Qian Peng, Hongyuan Zha, Xiangfeng Wang
A novel simulator called VMAgent is introduced to help RL researchers better explore new methods, especially for virtual machine scheduling.
no code implementations • 9 Feb 2021 • Wenhao Li, Xiangfeng Wang, Bo Jin, Junjie Sheng, Yun Hua, Hongyuan Zha
In order to improve the efficiency of cooperation and exploration, we propose a structured diversification emergence MARL framework named Rochico based on reinforced organization control and hierarchical consensus learning.
no code implementations • 11 Feb 2020 • Yun Hua, Xiangfeng Wang, Bo Jin, Wenhao Li, Junchi Yan, Xiaofeng He, Hongyuan Zha
In spite of the success of existing meta reinforcement learning methods, they still have difficulty in learning a meta policy effectively for RL problems with sparse reward.