Search Results for author: Jianzhun Shao

Found 7 papers, 2 papers with code

Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning

1 code implementation • NeurIPS 2023 • Jianzhun Shao, Yun Qu, Chen Chen, Hongchang Zhang, Xiangyang Ji

Offline multi-agent reinforcement learning is challenging due to the coupling of the distribution-shift issue common in the offline setting with the high-dimensionality issue common in the multi-agent setting, which makes out-of-distribution (OOD) actions and value overestimation especially severe (a minimal conservative-penalty sketch follows this entry).

counterfactual • Multi-agent Reinforcement Learning • +3
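
To make the OOD/overestimation problem concrete, here is a minimal single-agent sketch of a CQL-style conservative penalty, which pushes Q-values down across all actions and back up on dataset actions. This illustrates only the general conservatism idea, not the counterfactual multi-agent method the paper proposes; the network, shapes, and function names are hypothetical.

```python
# Hedged sketch (assumption: single-agent, discrete actions) of a CQL-style
# conservative penalty: the logsumexp term pushes Q down on all actions,
# while the dataset-action term pushes Q up on in-distribution actions.
# Not the paper's counterfactual multi-agent method.
import torch

def conservative_penalty(q_net, states, dataset_actions):
    q_all = q_net(states)                                   # (batch, n_actions)
    lse = torch.logsumexp(q_all, dim=1)                     # soft max over actions
    q_data = q_all.gather(1, dataset_actions.unsqueeze(1)).squeeze(1)
    return (lse - q_data).mean()

# Toy usage with a hypothetical linear Q-network: 4-dim states, 3 actions.
q_net = torch.nn.Linear(4, 3)
states = torch.randn(8, 4)
actions = torch.randint(0, 3, (8,))
penalty = conservative_penalty(q_net, states, actions)      # add to TD loss, scaled by alpha
```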

Wasserstein Unsupervised Reinforcement Learning

no code implementations • 15 Oct 2021 • Shuncheng He, Yuhang Jiang, Hongchang Zhang, Jianzhun Shao, Xiangyang Ji

These pre-trained policies can accelerate learning when endowed with an external reward, and can also be used as primitive options in hierarchical reinforcement learning (see the sketch after this entry).

Hierarchical Reinforcement Learning • reinforcement-learning • +2
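
As a toy illustration of the "primitive options" use mentioned in the abstract, the sketch below lets a high-level controller choose among pre-trained skill policies. The skill class and the greedy option selection are hypothetical stand-ins; the paper's Wasserstein-based skill pre-training is not reproduced here.

```python
# Hedged sketch: pre-trained skills reused as primitive options in a
# hierarchical controller. RandomSkill stands in for a policy pre-trained
# without external reward; the greedy option choice is a placeholder for a
# learned high-level policy. None of these names come from the paper.
import numpy as np

class RandomSkill:
    def __init__(self, n_actions, seed):
        self.rng = np.random.default_rng(seed)
        self.n_actions = n_actions

    def act(self, obs):
        # A real skill would condition on obs; this stand-in acts randomly.
        return int(self.rng.integers(self.n_actions))

skills = [RandomSkill(n_actions=4, seed=s) for s in range(3)]

def hierarchical_step(obs, option_values):
    option = int(np.argmax(option_values(obs)))  # high level: pick a skill
    return skills[option].act(obs)               # low level: skill picks the action

action = hierarchical_step(np.zeros(2), lambda o: np.array([0.1, 0.5, 0.2]))
```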

Reducing Conservativeness Oriented Offline Reinforcement Learning

no code implementations • 27 Feb 2021 • Hongchang Zhang, Jianzhun Shao, Yuhang Jiang, Shuncheng He, Xiangyang Ji

In offline reinforcement learning, a policy learns to maximize cumulative rewards from a fixed collection of data (a tabular sketch of this setup follows this entry).

D4RL • reinforcement-learning • +1
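
For readers new to the setting, here is a minimal tabular sketch of offline Q-learning: every update comes from a fixed batch of logged transitions, with no environment interaction. It shows only the problem setup from the abstract, not the conservativeness-reduction technique the paper proposes; the two-state MDP is made up.

```python
# Minimal tabular sketch of the offline setting: TD updates over a fixed
# dataset of (state, action, reward, next_state) tuples, with no new data
# collection. The toy MDP below is hypothetical, not from the paper.
import numpy as np

dataset = [(0, 1, 1.0, 1), (1, 0, 0.0, 0)]   # fixed collection of transitions
Q = np.zeros((2, 2))                          # 2 states x 2 actions
gamma, lr = 0.99, 0.1

for _ in range(200):                          # sweep the same batch repeatedly
    for s, a, r, s2 in dataset:
        target = r + gamma * Q[s2].max()      # bootstrapped target
        Q[s, a] += lr * (target - Q[s, a])    # TD update from logged data only
```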

Credit Assignment with Meta-Policy Gradient for Multi-Agent Reinforcement Learning

no code implementations • 24 Feb 2021 • Jianzhun Shao, Hongchang Zhang, Yuhang Jiang, Shuncheng He, Xiangyang Ji

Reward decomposition is a critical problem in the centralized training with decentralized execution (CTDE) paradigm for multi-agent reinforcement learning (a value-decomposition sketch follows this entry).

Meta-Learning • Multi-agent Reinforcement Learning • +4
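
To ground the reward-decomposition problem, the sketch below shows the simplest decomposition used in CTDE work: a VDN-style joint Q-value formed as the sum of per-agent utilities, so a shared team reward is implicitly credited to individual agents. The paper instead learns the credit assignment with a meta-policy gradient, which is not reproduced here; all names and shapes below are illustrative.

```python
# Hedged sketch of value decomposition under CTDE (VDN-style): the
# centralized Q-value is the sum of each agent's chosen-action utility,
# trained against the shared team reward via a TD loss. Illustrates the
# credit-assignment problem only, not the paper's meta-gradient method.
import torch

n_agents, obs_dim, n_actions = 2, 3, 4
agent_nets = [torch.nn.Linear(obs_dim, n_actions) for _ in range(n_agents)]

def joint_q(observations, actions):
    per_agent = [net(o).gather(0, a) for net, o, a in
                 zip(agent_nets, observations, actions)]
    return torch.stack(per_agent).sum()       # sum-decomposed centralized Q

obs = [torch.randn(obs_dim) for _ in range(n_agents)]
acts = [torch.tensor([1]), torch.tensor([2])]
q_total = joint_q(obs, acts)                  # regress toward team-reward TD target
```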
