Multi-agent Reinforcement Learning

432 papers with code • 3 benchmarks • 9 datasets

Multi-agent Reinforcement Learning aims to solve complex problems by combining multiple agents, each focused on a different sub-task. Multi-agent systems are broadly of two types: independent and cooperative.

Source: Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports

Most implemented papers

The StarCraft Multi-Agent Challenge

oxwhirl/pymarl 11 Feb 2019

In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap.

The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games

marlbenchmark/on-policy 2 Mar 2021

This is often due to the belief that PPO is significantly less sample efficient than off-policy methods in multi-agent systems.

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

oxwhirl/pymarl ICML 2018

At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
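
As a rough illustration of the monotonic factorisation named in the title above, the sketch below mixes per-agent values with non-negative weights derived from a global state. The numbers, the `mix` function, and its linear form are assumptions made for illustration; this is not the paper's mixing network or the oxwhirl/pymarl code.

```python
from itertools import product

# Hypothetical per-agent action values (one fixed local observation per agent).
Q_AGENT_1 = [0.2, 1.5, -0.3]
Q_AGENT_2 = [0.9, 0.1]

# Hypothetical global state, assumed to be available only during training.
STATE = (-2.0, 0.5, 3.0)


def mix(q_a, q_b, state):
    """Combine per-agent values into a joint value.

    Taking abs() keeps both weights non-negative, so the joint value never
    decreases when either agent's value increases (monotonicity).
    """
    w_a, w_b, bias = abs(state[0]), abs(state[1]), state[2]
    return w_a * q_a + w_b * q_b + bias


def greedy_decentralised():
    """Each agent picks its own best action without seeing the state."""
    a1 = max(range(len(Q_AGENT_1)), key=lambda a: Q_AGENT_1[a])
    a2 = max(range(len(Q_AGENT_2)), key=lambda a: Q_AGENT_2[a])
    return a1, a2


def greedy_centralised(state):
    """Search all joint actions for the one with the highest mixed value."""
    joint_actions = product(range(len(Q_AGENT_1)), range(len(Q_AGENT_2)))
    return max(
        joint_actions,
        key=lambda a: mix(Q_AGENT_1[a[0]], Q_AGENT_2[a[1]], state),
    )


if __name__ == "__main__":
    print(greedy_decentralised(), greedy_centralised(STATE))
```

With strictly positive weights, the joint action found by the centralised search matches the one the agents choose individually, which is what makes decentralised execution consistent with a centrally trained joint value in this toy setting.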

Value-Decomposition Networks For Cooperative Multi-Agent Learning

facebookresearch/benchmarl 16 Jun 2017

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal.
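
A minimal sketch of additive value decomposition, the idea the title above refers to: the joint value is modelled as a sum of per-agent values, so a single joint reward can train all of them together. The values and function names below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from itertools import product

# Hypothetical per-agent action values (one fixed local observation per agent).
Q_AGENT_1 = [0.2, 1.5, -0.3]
Q_AGENT_2 = [0.9, 0.1]


def joint_q(a1, a2):
    """Joint value modelled as the sum of the two per-agent values."""
    return Q_AGENT_1[a1] + Q_AGENT_2[a2]


def greedy_decentralised():
    """Each agent maximises its own value independently."""
    a1 = max(range(len(Q_AGENT_1)), key=lambda a: Q_AGENT_1[a])
    a2 = max(range(len(Q_AGENT_2)), key=lambda a: Q_AGENT_2[a])
    return a1, a2


def greedy_centralised():
    """Exhaustive search over joint actions, for comparison."""
    joint_actions = product(range(len(Q_AGENT_1)), range(len(Q_AGENT_2)))
    return max(joint_actions, key=lambda a: joint_q(*a))


if __name__ == "__main__":
    print(greedy_decentralised(), greedy_centralised())
```

Because the joint value is a plain sum, maximising each term separately also maximises the total, so the two greedy procedures agree on the same joint action here.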

RLCard: A Toolkit for Reinforcement Learning in Card Games

datamllab/rlcard 10 Oct 2019

The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward.

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

facebookresearch/benchmarl 18 Nov 2020

Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function.

Learning with Opponent-Learning Awareness

alshedivat/lola 13 Sep 2017

We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

cts198859/deeprl_dist ICML 2017

Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems.