1 code implementation • 5 Mar 2021 • Daniël Willemsen, Mario Coppola, Guido C. H. E. de Croon
MAMBPO uses a learned world model to improve sample efficiency compared to model-free Multi-Agent Soft Actor-Critic (MASAC).
reinforcement-learning Reinforcement Learning (RL)