Multi-agent Reinforcement Learning
432 papers with code • 3 benchmarks • 9 datasets
The goal of Multi-agent Reinforcement Learning (MARL) is to solve complex problems by coordinating multiple agents, each focusing on a different sub-task. In general, multi-agent systems fall into two types: independent and cooperative systems.
Source: Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports
Libraries
Use these libraries to find Multi-agent Reinforcement Learning models and implementations.
Most implemented papers
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
We explore deep reinforcement learning methods for multi-agent domains.
The StarCraft Multi-Agent Challenge
In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap.
The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games
PPO is often underused in multi-agent settings due to the belief that it is significantly less sample-efficient than off-policy methods.
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
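QMIX's core idea is a mixing network that combines per-agent Q-values into a joint Q-value while guaranteeing monotonicity, so each agent's greedy action is consistent with the joint greedy action. A minimal NumPy sketch of the monotonic mixing step (the toy weights below stand in for the paper's state-conditioned hypernetwork outputs, and ReLU is used in place of the paper's ELU):

```python
import numpy as np

rng = np.random.default_rng(0)

def qmix_mix(agent_qs, w1, b1, w2, b2):
    """Mix per-agent Q-values into Q_tot using absolute-value weights,
    which makes Q_tot monotonically non-decreasing in every agent's Q."""
    hidden = np.maximum(np.abs(w1) @ agent_qs + b1, 0.0)
    return float(np.abs(w2) @ hidden + b2)

# Hypothetical fixed weights; in QMIX these come from hypernetworks
# conditioned on the global state (available during centralised training).
w1, b1 = rng.normal(size=(4, 2)), np.zeros(4)
w2, b2 = rng.normal(size=4), 0.0

qs = np.array([1.0, 2.0])
q_tot = qmix_mix(qs, w1, b1, w2, b2)

# Monotonicity: raising either agent's Q can never lower Q_tot.
assert qmix_mix(qs + np.array([0.5, 0.0]), w1, b1, w2, b2) >= q_tot
```

The absolute value on the weights is what enforces the monotonicity constraint; it lets centralised training use global state while each agent still acts on its own Q-values at execution time.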
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
In this paper, we extend the theory of trust region learning to MARL.
Value-Decomposition Networks For Cooperative Multi-Agent Learning
We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal.
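VDN addresses the single joint reward by factoring the team's action-value into a sum of per-agent values. A toy sketch of the additive factorisation (Q-tables and function names are illustrative, not from the paper's code):

```python
import numpy as np

def vdn_joint_q(per_agent_qs, actions):
    """VDN additive factorisation: Q_tot is the sum of each
    agent's Q-value for its own chosen action."""
    return float(sum(q[a] for q, a in zip(per_agent_qs, actions)))

# Toy per-agent Q-tables for a single state (two agents, three actions).
q1 = np.array([0.1, 0.5, 0.2])
q2 = np.array([0.3, 0.0, 0.4])

# Because the factorisation is additive, the greedy joint action is just
# each agent's independent argmax, which enables decentralised execution.
greedy = [int(np.argmax(q)) for q in (q1, q2)]
print(greedy, vdn_joint_q([q1, q2], greedy))  # [1, 2] 0.9
```

The sum structure is strictly less expressive than QMIX's monotonic mixing, but it already lets a single team reward be backpropagated into per-agent value functions.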
RLCard: A Toolkit for Reinforcement Learning in Card Games
The goal of RLCard is to bridge reinforcement learning and imperfect-information games, and to push forward reinforcement learning research in domains with multiple agents, large state and action spaces, and sparse rewards.
Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?
Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function.
Learning with Opponent-Learning Awareness
We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.
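LOLA's key idea is that an agent should differentiate its objective through the opponent's anticipated learning step rather than treating the opponent as static. A numerical look-ahead sketch of that idea (finite differences on scalar parameters; this illustrates the look-ahead, not the paper's exact second-order or policy-gradient estimator, and all names here are hypothetical):

```python
def grad(f, x, eps=1e-5):
    """Central finite-difference derivative of a scalar function."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def lola_step(theta1, theta2, v1, v2, alpha=0.1, eta=0.1):
    """One LOLA-style update for agent 1: ascend v1 evaluated at the
    opponent parameters produced by the opponent's own naive gradient step."""
    # Opponent's anticipated naive update on its own value.
    g2 = grad(lambda t2: v2(theta1, t2), theta2)
    lookahead2 = theta2 + eta * g2
    # Agent 1 ascends its value at the anticipated opponent parameters.
    g1 = grad(lambda t1: v1(t1, lookahead2), theta1)
    return theta1 + alpha * g1

# Toy cooperative game: both agents want their parameters to match.
v = lambda t1, t2: -(t1 - t2) ** 2
print(lola_step(0.0, 1.0, v, v))  # ~0.16
```

In the paper, the same look-ahead is expanded to first order analytically, yielding a correction term on top of the naive gradient that can be estimated model-free with an extended policy gradient.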
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems.