no code implementations • 25 Feb 2025 • Lei LI, Sen Jia, Jianhao Wang, Zhaochong An, Jiaang Li, Jenq-Neng Hwang, Serge Belongie
Advancements in Multimodal Large Language Models (MLLMs) have improved human motion understanding.
1 code implementation • 31 May 2023 • Jianhao Wang, Jin Zhang, Haozhe Jiang, Junyu Zhang, LiWei Wang, Chongjie Zhang
We find that IDAQ with a return-based uncertainty quantification performs effectively.
no code implementations • 12 Jul 2022 • Jianing Ye, Chenghao Li, Jianhao Wang, Chongjie Zhang
Decentralized execution is a core requirement in cooperative multi-agent reinforcement learning (MARL).
Multi-agent Reinforcement Learning
Policy Gradient Methods
1 code implementation • 16 Mar 2022 • Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang
Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions.
1 code implementation • 7 Dec 2021 • Qianlan Yang, Weijun Dong, Zhizhou Ren, Jianhao Wang, Tonghan Wang, Chongjie Zhang
However, one critical challenge in this paradigm is the complexity of greedy action selection with respect to the factorized values.
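The challenge noted above disappears under an additive (VDN-style) factorization: when the joint value decomposes as a sum of per-agent utilities, the joint greedy action reduces to independent per-agent argmaxes instead of a search over the exponential joint action space. A minimal sketch, with toy random utilities standing in for learned value networks:

```python
import numpy as np
from itertools import product

# Toy stand-in: with Q_tot(a1,...,an) = sum_i Q_i(a_i), joint greedy
# selection decomposes into one argmax per agent.
rng = np.random.default_rng(0)
n_agents, n_actions = 3, 4
q_i = rng.normal(size=(n_agents, n_actions))  # per-agent utilities

# Decentralized greedy selection: a single argmax per agent, O(n * |A|).
greedy = tuple(q_i.argmax(axis=1))

# Brute-force search over the joint action space, O(|A|^n), for comparison.
best = max(product(range(n_actions), repeat=n_agents),
           key=lambda a: sum(q_i[i, ai] for i, ai in enumerate(a)))

assert greedy == best
```

Richer factorizations (e.g. monotonic mixing) preserve this decentralized-argmax property while representing more expressive joint values, which is exactly where the complexity trade-off discussed above arises.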
2 code implementations • NeurIPS 2021 • Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang
Efficient exploration in deep cooperative multi-agent reinforcement learning (MARL) remains challenging in complex coordination problems.
1 code implementation • NeurIPS 2021 • Jianhao Wang, Wenzhe Li, Haozhe Jiang, Guangxiang Zhu, Siyuan Li, Chongjie Zhang
These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset.
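The reverse-imagination idea above can be sketched in a few lines: a reverse model predicts a predecessor state and action for a given state, so imagined trajectories lead *into* states covered by the offline dataset rather than drifting out of it. The `reverse_model` below is a hypothetical linear stand-in, not the paper's learned model:

```python
import numpy as np

rng = np.random.default_rng(1)

def reverse_model(s_next):
    # Toy stand-in: sample an action and invert the known dynamic s' = s + a.
    a = rng.uniform(-1.0, 1.0, size=s_next.shape)
    return s_next - a, a

def reverse_rollout(s_anchor, horizon=5):
    # Roll backwards from a dataset state, collecting (s, a, s') tuples.
    transitions, s_next = [], s_anchor
    for _ in range(horizon):
        s, a = reverse_model(s_next)
        transitions.append((s, a, s_next))
        s_next = s
    return transitions[::-1]  # return in chronological order

aug = reverse_rollout(np.zeros(2))
# Every imagined trajectory terminates at the anchoring dataset state.
assert len(aug) == 5 and np.allclose(aug[-1][2], 0.0)
```

The augmented transitions can then be fed to a model-free learner, which is what makes the generalization conservative: imagined experience always connects back to in-distribution states.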
no code implementations • 26 Sep 2021 • Jiahan Cao, Lei Yuan, Jianhao Wang, Shaowei Zhang, Chongjie Zhang, Yang Yu, De-Chuan Zhan
During long-time observations, agents can build awareness of teammates to alleviate the problem of partial observability.
1 code implementation • ICLR 2022 • Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang
Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.
no code implementations • 1 Jan 2021 • Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang
Deep reinforcement learning algorithms generally require large amounts of data to solve a single task.
no code implementations • ICLR 2021 • Siyuan Li, Lulu Zheng, Jianhao Wang, Chongjie Zhang
In goal-conditioned Hierarchical Reinforcement Learning (HRL), a high-level policy periodically sets subgoals for a low-level policy, and the low-level policy is trained to reach those subgoals.
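The control loop described above can be sketched with hypothetical stand-ins: a high-level policy proposes a subgoal every `k` steps, and the low-level policy is rewarded (here, by negative distance) for reaching it. Both policies and the dynamics are toy placeholders, not the paper's learned components:

```python
import numpy as np

rng = np.random.default_rng(2)
k = 5  # high-level decision interval (subgoal horizon)

def high_policy(state):
    # Stand-in: propose a nearby subgoal.
    return state + rng.uniform(-1.0, 1.0, size=state.shape)

def low_policy(state, subgoal):
    # Stand-in: step toward the subgoal with bounded actions.
    return np.clip(subgoal - state, -0.5, 0.5)

state = np.zeros(2)
for t in range(20):
    if t % k == 0:                      # high level acts every k steps
        subgoal = high_policy(state)
    action = low_policy(state, subgoal)
    intrinsic_reward = -np.linalg.norm(state + action - subgoal)
    state = state + action              # toy dynamics: action = displacement

# With bounded steps and nearby subgoals, the low level reaches each subgoal.
assert np.allclose(state, subgoal)
```

The intrinsic reward is what decouples low-level training from sparse external rewards, which is also the source of the difficulty noted above: the high level still depends on the external signal.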
no code implementations • 28 Sep 2020 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang
Value decomposition is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings.
6 code implementations • ICLR 2021 • Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang
This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.
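The duplex dueling idea can be illustrated with a simplified numerical sketch (not the paper's full network): each agent's utility is split as Q_i = V_i + A_i with advantages A_i ≤ 0, and the joint value reweights the advantages with positive coefficients, Q_tot = Σ_i V_i + Σ_i λ_i A_i. Because λ_i > 0, each agent's greedy action is preserved, which is the IGM (Individual-Global-Max) consistency the factorization is built around. In QPLEX the λ_i are produced by an attention mechanism over the joint state-action; here they are fixed random positives:

```python
import numpy as np

rng = np.random.default_rng(3)
n_agents, n_actions = 3, 4
q_i = rng.normal(size=(n_agents, n_actions))  # per-agent utilities

v_i = q_i.max(axis=1, keepdims=True)   # dueling split: per-agent values
a_i = q_i - v_i                        # advantages: <= 0, zero at the argmax
lam = rng.uniform(0.5, 2.0, size=(n_agents, 1))  # positive mixing weights

q_mixed = v_i + lam * a_i              # reweighted per-agent utilities

# Positive reweighting leaves every agent's greedy action unchanged (IGM).
assert (q_mixed.argmax(axis=1) == q_i.argmax(axis=1)).all()
```

The expressiveness comes from letting the weights vary with the joint state-action while the sign constraint keeps decentralized greedy execution consistent with the joint optimum.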
1 code implementation • 15 Jun 2020 • Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang
Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks.
no code implementations • NeurIPS 2021 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang
Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, which balances the learning scalability and the representational capacity of value functions.
1 code implementation • ICLR 2020 • Tonghan Wang, Jianhao Wang, Yi Wu, Chongjie Zhang
We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents.
1 code implementation • ICLR 2020 • Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang
Recently, value function factorization learning has emerged as a promising way to address these challenges in collaborative multi-agent systems.
no code implementations • ICLR 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Chongjie Zhang
Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability.
1 code implementation • 16 Apr 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang
We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability.