1 code implementation • 31 May 2024 • Hao Hu, Yiqin Yang, Jianing Ye, Chengjie Wu, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang
In this paper, we tackle the fundamental dilemma of offline-to-online fine-tuning: if the agent remains pessimistic, it may fail to learn a better policy, whereas if it turns optimistic directly, performance may suffer a sudden drop.
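As a purely illustrative sketch of this dilemma (not the method proposed in the paper), one simple way to interpolate between pessimism and optimism is to act on a lower-confidence-bound value estimate whose pessimism coefficient is annealed as online experience accumulates; the ensemble-based Q estimate and the linear schedule below are assumptions made for the sketch.

```python
import numpy as np

def lcb_action(q_mean, q_std, beta):
    """Pick the action maximizing a pessimistic (lower-confidence-bound) estimate."""
    return int(np.argmax(q_mean - beta * q_std))

def anneal_beta(step, total_steps, beta_start=1.0, beta_end=0.0):
    """Linearly decay the pessimism coefficient over online fine-tuning."""
    frac = min(step / total_steps, 1.0)
    return beta_start + frac * (beta_end - beta_start)

# Toy usage: a small ensemble of Q-heads supplies the mean/std over actions.
q_ensemble = np.random.randn(5, 4)                 # 5 ensemble members, 4 actions
q_mean, q_std = q_ensemble.mean(axis=0), q_ensemble.std(axis=0)
for step in range(0, 1001, 250):
    beta = anneal_beta(step, total_steps=1000)      # pessimistic early, neutral later
    print(step, round(beta, 2), lcb_action(q_mean, q_std, beta))
```

Annealing avoids the abrupt switch described above: the agent keeps acting conservatively while its online data is scarce and only gradually relaxes the penalty.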
1 code implementation • 20 May 2024 • Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang
Extensive experiments on the SMAC benchmark demonstrate that MAZero outperforms model-free approaches in sample efficiency and matches or exceeds existing model-based methods in both sample and computational efficiency.
Computational Efficiency • Model-based Reinforcement Learning • +2
no code implementations • 12 Jul 2022 • Jianing Ye, Chenghao Li, Jianhao Wang, Chongjie Zhang
Decentralized execution is a core requirement in cooperative multi-agent reinforcement learning (MARL).
Multi-agent Reinforcement Learning • Policy Gradient Methods • +2
1 code implementation • 11 Mar 2021 • Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang
Episodic memory-based methods can rapidly latch onto past successful strategies through a non-parametric memory, improving the sample efficiency of traditional reinforcement learning.
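A minimal sketch of the general idea (assumed details only, not this paper's exact algorithm): a non-parametric table that keeps the best Monte-Carlo return observed for each state-action key, which a parametric value function can query to guide or regularize its targets.

```python
from collections import defaultdict

class EpisodicMemory:
    """Non-parametric memory: (state, action) key -> best return seen so far."""

    def __init__(self):
        self.table = defaultdict(lambda: float("-inf"))

    def update(self, key, mc_return):
        # Keep only the highest Monte-Carlo return ever achieved from this key.
        if mc_return > self.table[key]:
            self.table[key] = mc_return

    def lookup(self, key, default=0.0):
        value = self.table[key]
        return value if value != float("-inf") else default

# Toy usage: after an episode, write discounted returns back along the trajectory.
memory = EpisodicMemory()
trajectory = [(("s0", 1), 0.0), (("s1", 0), 1.0)]   # ((state, action), reward)
gamma, G = 0.99, 0.0
for key, reward in reversed(trajectory):
    G = reward + gamma * G
    memory.update(key, G)
print(memory.lookup(("s0", 1)))                      # 0.99, the discounted final reward
```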
no code implementations • 28 Sep 2020 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang
Value decomposition is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings.
no code implementations • NeurIPS 2021 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang
Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, balancing learning scalability with the representational capacity of value functions.
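For illustration only, a VDN-style additive decomposition is one simple instance of value factorization (an assumption for this sketch, not necessarily the factorization studied in the paper): the joint action-value is modeled as the sum of per-agent utilities, so each agent can act greedily on its own utility while training remains centralized.

```python
import torch
import torch.nn as nn

class PerAgentQ(nn.Module):
    """Per-agent utility network Q_i(o_i, ·)."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, obs):                  # obs: (batch, obs_dim)
        return self.net(obs)                 # utilities: (batch, n_actions)

def joint_q(agent_nets, observations, actions):
    """Additive factorization: Q_tot = sum_i Q_i(o_i, a_i)."""
    utilities = [
        net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
        for net, obs, act in zip(agent_nets, observations, actions)
    ]
    return torch.stack(utilities, dim=0).sum(dim=0)   # (batch,)

# Toy usage with two agents.
nets = [PerAgentQ(obs_dim=8, n_actions=5) for _ in range(2)]
obs = [torch.randn(4, 8) for _ in range(2)]
acts = [torch.randint(0, 5, (4,)) for _ in range(2)]
print(joint_q(nets, obs, acts).shape)        # torch.Size([4])
```

The additive form makes the greedy joint action decomposable across agents; richer mixing functions trade some of this simplicity for more representational capacity, which is the balance the excerpt above refers to.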