TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
SMAC+	Def_Armored_parallel	VDN	Median Win Rate	5.0	# 4
SMAC+	Def_Armored_sequential	VDN	Median Win Rate	96.9	# 2
SMAC+	Def_Infantry_parallel	VDN	Median Win Rate	95.0	# 3
SMAC+	Def_Infantry_sequential	VDN	Median Win Rate	96.9	# 5
SMAC+	Def_Outnumbered_parallel	VDN	Median Win Rate	0.0	# 4
SMAC+	Def_Outnumbered_sequential	VDN	Median Win Rate	15.6	# 4
SMAC+	Off_Complicated_parallel	VDN	Median Win Rate	70.0	# 2
SMAC+	Off_Distant_parallel	VDN	Median Win Rate	85.0	# 2
SMAC+	Off_Hard_parallel	VDN	Median Win Rate	15.0	# 2
SMAC+	Off_Near_parallel	VDN	Median Win Rate	90.0	# 3
SMAC+	Off_Superhard_parallel	VDN	Median Win Rate	0.0	# 1

Value-Decomposition Networks For Cooperative Multi-Agent Learning

16 Jun 2017 · Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the "lazy agent" problem, which arises due to partial observability. We address these problems by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions. We perform an experimental evaluation across a range of partially-observable multi-agent domains and show that learning such value-decompositions leads to superior results, in particular when combined with weight sharing, role information and information channels.
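
The decomposition described in the abstract can be illustrated with a small sketch. It assumes the additive form commonly associated with VDN, in which the team value is the sum of per-agent values, Q_tot = Q_1 + ... + Q_n. The function names, the two-agent setup, and the Q-values below are hypothetical and are not taken from the paper's code; real implementations produce the per-agent values with neural networks rather than fixed lists.

```python
# Minimal sketch of additive value decomposition (illustrative only).
# All names and numbers are hypothetical, not from the paper's code.
from itertools import product


def joint_q(per_agent_q, joint_action):
    """Team value as the sum of each agent's value for its own action."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def greedy_joint_action(per_agent_q):
    """Each agent independently picks the action maximising its own Q-values."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def td_target(reward, next_per_agent_q, gamma=0.99):
    """One-step Q-learning target built from the single team reward."""
    best = greedy_joint_action(next_per_agent_q)
    return reward + gamma * joint_q(next_per_agent_q, best)


# Two agents with three actions each (made-up values).
per_agent_q = [[0.1, 0.7, 0.2], [0.5, 0.4, 0.9]]

# Decentralised choice: every agent looks only at its own values.
decentralised = greedy_joint_action(per_agent_q)

# Centralised choice: brute-force search over all nine joint actions.
centralised = max(
    product(range(3), repeat=2),
    key=lambda joint: joint_q(per_agent_q, joint),
)
```

In this toy case both `decentralised` and `centralised` are `(1, 2)`: because the team value is a plain sum, maximising each agent's term separately also maximises the total. This lets agents act on their own values at execution time while training still uses only the joint reward.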

Code

facebookresearch/benchmarl

hhhusiyi-monash/UPDeT

TonghanWang/NDQ

TonghanWang/DOP

puyuan1996/MARL

See all 8 implementations

Tasks

Multi-agent Reinforcement Learning

reinforcement-learning

Reinforcement Learning (RL)

SMAC+

Datasets

SMAC-Exp

Def_Armored_sequential

Def_Infantry_sequential

Def_Infantry_parallel

Def_Outnumbered_sequential

Def_Armored_parallel

Def_Outnumbered_parallel

Off_Hard_parallel

Off_Superhard_parallel

Off_Near_parallel

Off_Complicated_parallel

Off_Distant_parallel

Results from the Paper

Ranked #1 on SMAC+ on Off_Superhard_parallel
