DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

16 Feb 2021 · Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee

In fully cooperative multi-agent reinforcement learning (MARL) settings, the environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of the other agents. To address the above issues, we integrate distributional RL and value function factorization methods by proposing a Distributional Value Function Factorization (DFAC) framework to generalize expected value function factorization methods to their DFAC variants. DFAC extends the individual utility functions from deterministic variables to random variables, and models the quantile function of the total return as a quantile mixture. To validate DFAC, we demonstrate DFAC's ability to factorize a simple two-step matrix game with stochastic rewards and perform experiments on all Super Hard tasks of StarCraft Multi-Agent Challenge, showing that DFAC is able to outperform expected value function factorization baselines.
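As a rough illustration of the quantile mixture mentioned in the abstract, the following minimal NumPy sketch combines per-agent quantile functions, evaluated on a shared grid of quantile fractions, into a single quantile function by a non-negative weighted sum. The function name `quantile_mixture`, the unit weights, and the uniform per-agent utilities are assumptions chosen for illustration; this is not the authors' implementation and does not reproduce DFAC's networks or training procedure.

```python
import numpy as np


def quantile_mixture(agent_quantiles, weights):
    """Weighted sum of per-agent quantile functions on a shared grid.

    agent_quantiles: shape (n_agents, n_quantiles); row k holds agent k's
        quantile function evaluated at the same quantile fractions.
    weights: shape (n_agents,), non-negative mixture weights.
    Returns an array of shape (n_quantiles,).
    """
    agent_quantiles = np.asarray(agent_quantiles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        # Non-negative weights keep the result non-decreasing,
        # so it is still a valid quantile function.
        raise ValueError("mixture weights must be non-negative")
    return weights @ agent_quantiles


# Shared grid of quantile fractions (midpoints of four equal bins).
taus = np.array([0.125, 0.375, 0.625, 0.875])

# Hypothetical per-agent utilities: uniform on [0, 1] and uniform on [0, 2],
# whose quantile functions are tau and 2 * tau respectively.
agent_quantiles = np.stack([taus, 2.0 * taus])

# Unit weights (an assumption): the joint quantile function is the plain sum.
total = quantile_mixture(agent_quantiles, np.ones(2))
print(total)
```

Note that summing quantile functions is not the same as summing independent random variables: the mixture acts on each quantile fraction separately, which is what keeps the combined function easy to evaluate from the per-agent ones.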

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| SMAC+ | Def_Armored_parallel | DIQL | Median Win Rate | 0.0 | #6 |
| SMAC+ | Def_Armored_parallel | DMIX | Median Win Rate | 90.0 | #1 |
| SMAC+ | Def_Armored_parallel | DDN | Median Win Rate | 0.0 | #6 |
| SMAC+ | Def_Armored_sequential | DMIX | Median Win Rate | 81.3 | #5 |
| SMAC+ | Def_Armored_sequential | DIQL | Median Win Rate | 53.1 | #7 |
| SMAC+ | Def_Armored_sequential | DDN | Median Win Rate | 71.9 | #6 |
| SMAC+ | Def_Infantry_parallel | DDN | Median Win Rate | 20.0 | #10 |
| SMAC+ | Def_Infantry_parallel | DMIX | Median Win Rate | 90.0 | #5 |
| SMAC+ | Def_Infantry_sequential | DDN | Median Win Rate | 90.6 | #9 |
| SMAC+ | Def_Infantry_sequential | DMIX | Median Win Rate | 100 | #1 |
| SMAC+ | Def_Infantry_sequential | DIQL | Median Win Rate | 93.8 | #7 |
| SMAC+ | Def_Outnumbered_parallel | DDN | Median Win Rate | 0.0 | #4 |
| SMAC+ | Def_Outnumbered_parallel | DMIX | Median Win Rate | 5.0 | #3 |
| SMAC+ | Def_Outnumbered_parallel | DIQL | Median Win Rate | 0.0 | #4 |
| SMAC+ | Def_Outnumbered_sequential | DDN | Median Win Rate | 0.0 | #5 |
| SMAC+ | Def_Outnumbered_sequential | DIQL | Median Win Rate | 0.0 | #5 |
| SMAC+ | Def_Outnumbered_sequential | DMIX | Median Win Rate | 0.0 | #5 |
| SMAC+ | Off_Complicated_parallel | DIQL | Median Win Rate | 0.0 | #4 |
| SMAC+ | Off_Complicated_parallel | DMIX | Median Win Rate | 0.0 | #4 |
| SMAC+ | Off_Complicated_parallel | DDN | Median Win Rate | 0.0 | #4 |
| SMAC+ | Off_Distant_parallel | DIQL | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Distant_parallel | DDN | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Distant_parallel | DMIX | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Hard_parallel | DDN | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Hard_parallel | DMIX | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Hard_parallel | DIQL | Median Win Rate | 0.0 | #3 |
| SMAC+ | Off_Near_parallel | DMIX | Median Win Rate | 0.0 | #6 |
| SMAC+ | Off_Near_parallel | DDN | Median Win Rate | 0.0 | #6 |
| SMAC+ | Off_Near_parallel | DIQL | Median Win Rate | 0.0 | #6 |
| SMAC+ | Off_Superhard_parallel | DDN | Median Win Rate | 0.0 | #1 |
| SMAC+ | Off_Superhard_parallel | DMIX | Median Win Rate | 0.0 | #1 |
| SMAC+ | Off_Superhard_parallel | DIQL | Median Win Rate | 0.0 | #1 |
| SMAC | SMAC 27m_vs_30m | IQL | Median Win Rate | 2.27 | #10 |
| SMAC | SMAC 27m_vs_30m | IQL | Average Score | 14.01 | #8 |
| SMAC | SMAC 27m_vs_30m | DDN | Median Win Rate | 91.48 | #1 |
| SMAC | SMAC 27m_vs_30m | DDN | Average Score | 19.71 | #1 |
| SMAC | SMAC 27m_vs_30m | DMIX | Median Win Rate | 85.45 | #3 |
| SMAC | SMAC 27m_vs_30m | DMIX | Average Score | 19.43 | #3 |
| SMAC | SMAC 27m_vs_30m | DIQL | Median Win Rate | 6.02 | #9 |
| SMAC | SMAC 27m_vs_30m | DIQL | Average Score | 14.45 | #7 |
| SMAC | SMAC 27m_vs_30m | QMIX | Median Win Rate | 84.77 | #4 |
| SMAC | SMAC 27m_vs_30m | QMIX | Average Score | 19.41 | #4 |
| SMAC | SMAC 27m_vs_30m | VDN | Median Win Rate | 63.12 | #6 |
| SMAC | SMAC 27m_vs_30m | VDN | Average Score | 18.45 | #6 |
| SMAC | SMAC 3s5z_vs_3s6z | QMIX | Median Win Rate | 67.22 | #7 |
| SMAC | SMAC 3s5z_vs_3s6z | QMIX | Average Score | 20.16 | #4 |
| SMAC | SMAC 3s5z_vs_3s6z | VDN | Median Win Rate | 89.2 | #5 |
| SMAC | SMAC 3s5z_vs_3s6z | VDN | Average Score | 19.75 | #5 |
| SMAC | SMAC 3s5z_vs_3s6z | IQL | Median Win Rate | 29.83 | #9 |
| SMAC | SMAC 3s5z_vs_3s6z | IQL | Average Score | 16.54 | #8 |
| SMAC | SMAC 3s5z_vs_3s6z | DMIX | Median Win Rate | 91.08 | #3 |
| SMAC | SMAC 3s5z_vs_3s6z | DMIX | Average Score | 19.7 | #6 |
| SMAC | SMAC 3s5z_vs_3s6z | DDN | Median Win Rate | 94.03 | #2 |
| SMAC | SMAC 3s5z_vs_3s6z | DDN | Average Score | 20.94 | #1 |
| SMAC | SMAC 3s5z_vs_3s6z | DIQL | Median Win Rate | 62.22 | #8 |
| SMAC | SMAC 3s5z_vs_3s6z | DIQL | Average Score | 17.52 | #7 |
| SMAC | SMAC 6h_vs_8z | QMIX | Median Win Rate | 12.78 | #5 |
| SMAC | SMAC 6h_vs_8z | QMIX | Average Score | 14.37 | #7 |
| SMAC | SMAC 6h_vs_8z | IQL | Median Win Rate | 0 | #8 |
| SMAC | SMAC 6h_vs_8z | IQL | Average Score | 13.78 | #8 |
| SMAC | SMAC 6h_vs_8z | DDN | Median Win Rate | 83.92 | #2 |
| SMAC | SMAC 6h_vs_8z | DDN | Average Score | 19.4 | #1 |
| SMAC | SMAC 6h_vs_8z | DMIX | Median Win Rate | 49.43 | #3 |
| SMAC | SMAC 6h_vs_8z | DMIX | Average Score | 17.14 | #3 |
| SMAC | SMAC 6h_vs_8z | DIQL | Median Win Rate | 0.00 | #8 |
| SMAC | SMAC 6h_vs_8z | DIQL | Average Score | 14.94 | #6 |
| SMAC | SMAC 6h_vs_8z | VDN | Median Win Rate | 0 | #8 |
| SMAC | SMAC 6h_vs_8z | VDN | Average Score | 15.41 | #5 |
| SMAC | SMAC corridor | VDN | Median Win Rate | 85.34 | #5 |
| SMAC | SMAC corridor | VDN | Average Score | 19.47 | #4 |
| SMAC | SMAC corridor | DDN | Median Win Rate | 95.4 | #2 |
| SMAC | SMAC corridor | DDN | Average Score | 20 | #1 |
| SMAC | SMAC corridor | DMIX | Median Win Rate | 90.45 | #4 |
| SMAC | SMAC corridor | DMIX | Average Score | 19.66 | #3 |
| SMAC | SMAC corridor | DIQL | Median Win Rate | 91.62 | #3 |
| SMAC | SMAC corridor | DIQL | Average Score | 19.68 | #2 |
| SMAC | SMAC corridor | QMIX | Median Win Rate | 37.61 | #9 |
| SMAC | SMAC corridor | QMIX | Average Score | 15.07 | #8 |
| SMAC | SMAC corridor | IQL | Median Win Rate | 84.87 | #6 |
| SMAC | SMAC corridor | IQL | Average Score | 19.42 | #5 |
| SMAC | SMAC MMM2 | DIQL | Median Win Rate | 85.23 | #8 |
| SMAC | SMAC MMM2 | DIQL | Average Score | 19.21 | #7 |
| SMAC | SMAC MMM2 | VDN | Median Win Rate | 89.2 | #7 |
| SMAC | SMAC MMM2 | VDN | Average Score | 19.36 | #6 |
| SMAC | SMAC MMM2 | IQL | Median Win Rate | 68.92 | #11 |
| SMAC | SMAC MMM2 | IQL | Average Score | 17.5 | #8 |
| SMAC | SMAC MMM2 | DMIX | Median Win Rate | 95.11 | #5 |
| SMAC | SMAC MMM2 | DMIX | Average Score | 19.87 | #3 |
| SMAC | SMAC MMM2 | DDN | Median Win Rate | 97.22 | #2 |
| SMAC | SMAC MMM2 | DDN | Average Score | 20.9 | #1 |
| SMAC | SMAC MMM2 | QMIX | Median Win Rate | 92.44 | #6 |
| SMAC | SMAC MMM2 | QMIX | Average Score | 19.42 | #5 |

Methods