no code implementations • 4 Dec 2023 • Lukas Schäfer, Logan Jones, Anssi Kanervisto, Yuhan Cao, Tabish Rashid, Raluca Georgescu, Dave Bignell, Siddhartha Sen, Andrea Treviño Gavito, Sam Devlin
Video games have served as useful benchmarks for the decision-making community, but going beyond Atari games towards training agents in modern games has been prohibitively expensive for the vast majority of the research community.
1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin
This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments.
1 code implementation • NeurIPS 2021 • Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson
Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.
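The overestimation this abstract refers to arises because the max operator over noisy value estimates is biased upward. A minimal numpy illustration of that bias, and of how a softmax (log-sum-exp) operator tempers it — this is a generic sketch, not the paper's actual regularized method:

```python
import numpy as np

rng = np.random.default_rng(0)

# True Q-values are all zero, so the true max is 0.
n_actions, n_trials = 10, 100_000
noisy_q = rng.normal(loc=0.0, scale=1.0, size=(n_trials, n_actions))

# E[max_a Qhat(a)] > max_a Q(a) = 0: the hard max is biased upward.
max_bias = noisy_q.max(axis=1).mean()

# A softmax (log-sum-exp) operator with moderate temperature beta
# reduces the bias relative to the hard max.
beta = 1.0
softmax_val = (np.log(np.exp(beta * noisy_q).mean(axis=1)) / beta).mean()

print(f"hard-max estimate: {max_bias:.3f}")   # well above the true value 0
print(f"softmax estimate:  {softmax_val:.3f}")  # smaller than the hard max
```

As beta grows, the softmax estimate approaches the hard max; as it shrinks, it approaches the (unbiased) mean — the temperature trades off bias against the quality of the greedy target.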
no code implementations • 22 Mar 2021 • Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson
Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.
1 code implementation • 22 Jan 2021 • Tabish Rashid, Cheng Zhang, Kamil Ciosek
We show the benefits of using information gain as compared to the confidence interval criterion of ResponseGraphUCB (Rowland et al. 2019), and provide theoretical results justifying our method.
4 code implementations • NeurIPS 2020 • Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson
We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.
1 code implementation • 19 Mar 2020 • Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted.
Ranked #6 on SMAC 6h_vs_8z
3 code implementations • NeurIPS 2021 • Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
1 code implementation • ICLR 2020 • Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson
We show that this scheme is provably efficient in the tabular setting and extend it to the deep RL setting.
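As a generic illustration of the kind of tabular scheme the abstract alludes to — not the paper's exact algorithm — the sketch below runs tabular Q-learning with a count-based optimism bonus on a toy chain MDP (environment and constants are made up for the example):

```python
import numpy as np

# A tiny deterministic chain MDP: states 0..n-1, actions {0: left, 1: right}.
# Reward 1 only on reaching the rightmost state; gamma < 1 rewards speed.
n_states, gamma, alpha, c = 8, 0.95, 0.5, 1.0

Q = np.zeros((n_states, 2))       # pessimistic (zero) initialisation
counts = np.ones((n_states, 2))   # visit counts driving the bonus

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

for episode in range(500):
    s = 0
    for t in range(4 * n_states):
        # Act greedily w.r.t. Q plus a count-based optimism bonus, so
        # rarely tried actions look temporarily attractive.
        bonus = c / np.sqrt(counts[s])
        a = int(np.argmax(Q[s] + bonus))
        s2, r = step(s, a)
        counts[s, a] += 1
        # Standard tabular Q-learning update (bonus is not bootstrapped).
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if r == 1.0:
            break

# The greedy policy should now prefer "right" in every non-goal state.
print((Q[:-1, 1] > Q[:-1, 0]).all())
```

The bonus decays as visits accumulate, so behaviour is optimistic early and greedy later even though Q itself was initialised pessimistically at zero — the qualitative effect the abstract describes.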
4 code implementations • NeurIPS 2019 • Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson
We specifically focus on QMIX [40], the current state-of-the-art in this domain.
no code implementations • 5 Jun 2019 • Wendelin Böhmer, Tabish Rashid, Shimon Whiteson
This paper investigates the use of intrinsic reward to guide exploration in multi-agent reinforcement learning.
20 code implementations • 11 Feb 2019 • Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson
In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap.
Ranked #6 on SMAC 6h_vs_8z
16 code implementations • ICML 2018 • Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
Ranked #1 on SMAC+ on Off_Near_parallel