MADiff: Offline Multi-agent Learning with Diffusion Models

Diffusion model (DM) recently achieved huge success in various scenarios including offline reinforcement learning, where the diffusion planner learn to generate desired trajectories during online evaluations. However, despite the effectiveness in single-agent learning, it remains unclear how DMs can operate in multi-agent problems, where agents can hardly complete teamwork without good coordination by independently modeling each agent's trajectories. In this paper, we propose MADiff, a novel generative multi-agent learning framework to tackle this problem. MADiff is realized with an attention-based diffusion model to model the complex coordination among behaviors of multiple agents. To the best of our knowledge, MADiff is the first diffusion-based multi-agent learning framework, which behaves as both a decentralized policy and a centralized controller. During decentralized executions, MADiff simultaneously performs teammate modeling, and the centralized controller can also be applied in multi-agent trajectory predictions. Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks, which emphasizes the effectiveness of MADiff in modeling complex multi-agent interactions. Our code is available at

Results in Papers With Code
(↓ scroll down to see all results)