SMAC
38 papers with code • 11 benchmarks • 1 dataset
The StarCraft Multi-Agent Challenge (SMAC) is a benchmark that provides elements of partial observability, challenging dynamics, and high-dimensional observation spaces. SMAC is built using the StarCraft II game engine, creating a testbed for research in cooperative MARL where each game unit is an independent RL agent.
Latest papers with no code
QFree: A Universal Value Function Factorization for Multi-Agent Reinforcement Learning
Once a joint policy is obtained, it is critical to design a value function factorization method to extract optimal decentralized policies for the agents, which needs to satisfy the individual-global-max (IGM) principle.
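The IGM (individual-global-max) principle requires that the joint greedy action of the factored value function coincide with each agent's individual greedy action. A minimal sketch in numpy, using a VDN-style additive factorization Q_tot(a1, a2) = Q1(a1) + Q2(a2) with made-up utility values (any monotonic mixing of the individual utilities would satisfy IGM the same way):

```python
import numpy as np

# Hypothetical per-action utilities for two agents (illustrative values only).
q1 = np.array([1.0, 3.0, 2.0])   # agent 1's individual action values
q2 = np.array([0.5, 1.5, 4.0])   # agent 2's individual action values

# Joint value table induced by additive (VDN-style) mixing.
q_tot = q1[:, None] + q2[None, :]

# IGM: the joint greedy action equals the tuple of individual greedy actions.
joint_greedy = np.unravel_index(np.argmax(q_tot), q_tot.shape)
individual_greedy = (int(np.argmax(q1)), int(np.argmax(q2)))
assert joint_greedy == individual_greedy
```

Because the mixing is monotonically increasing in each individual utility, maximizing each Qi independently is guaranteed to maximize Q_tot, which is what allows decentralized execution of the learned policies.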
MaskMA: Towards Zero-Shot Multi-Agent Decision Making with Mask-Based Collaborative Learning
Building a single generalist agent with strong zero-shot capability has recently sparked significant advancements.
Privacy-Engineered Value Decomposition Networks for Cooperative Multi-Agent Reinforcement Learning
Accordingly, we propose Privacy-Engineered Value Decomposition Networks (PE-VDN), a Co-MARL algorithm that models multi-agent coordination while provably safeguarding the confidentiality of the agents' environment interaction data.
Research on Multi-Agent Communication and Collaborative Decision-Making Based on Deep Reinforcement Learning
To alleviate the non-stationarity of the multi-agent environment, a multi-agent communication mechanism based on weight scheduling and an attention module is introduced.
MABL: Bi-Level Latent-Variable World Model for Sample-Efficient Multi-Agent Reinforcement Learning
Unlike existing models, MABL is capable of encoding essential global information into the latent states during training while guaranteeing the decentralized execution of learned policies.
GHQ: Grouped Hybrid Q Learning for Heterogeneous Cooperative Multi-agent Reinforcement Learning
We firstly define and describe the heterogeneous problems in SMAC.
Decision-making with Speculative Opponent Models
To address this issue, we introduce Distributional Opponent-aided Multi-agent Actor-Critic (DOMAC), the first speculative opponent modelling algorithm that relies solely on local information (i.e., the controlled agent's observations, actions, and rewards).
Contextual Transformer for Offline Meta Reinforcement Learning
Firstly, we propose prompt tuning for offline RL, where a context vector sequence is concatenated with the input to guide the conditional policy generation.
PTDE: Personalized Training with Distilled Execution for Multi-Agent Reinforcement Learning
Furthermore, we introduce a novel paradigm named Personalized Training with Distilled Execution (PTDE), wherein agent-personalized global information is distilled into the agent's local information.
Maximum Correntropy Value Decomposition for Multi-agent Deep Reinforcement Learning
In this paper, we first demonstrate the flaw of Weighted QMIX using an ordinary One-Step Matrix Game (OMG): no matter how the weight is chosen, Weighted QMIX struggles with non-monotonic value decomposition problems that have a large variance in reward distributions.
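The difficulty such one-step matrix games pose for monotonic factorizations can be seen with a short numpy sketch. The payoff table below is of the classic non-monotonic kind used throughout the QTRAN/Weighted QMIX literature; the exact numbers are illustrative, not taken from the paper:

```python
import numpy as np

# Two-agent one-step matrix game with a non-monotonic payoff structure
# (illustrative numbers in the style of the QTRAN / Weighted QMIX examples).
payoff = np.array([
    [  8.0, -12.0, -12.0],
    [-12.0,   0.0,   0.0],
    [-12.0,   0.0,   0.0],
])

# The joint optimum is action pair (0, 0) with payoff 8.
joint_best = np.unravel_index(np.argmax(payoff), payoff.shape)

# If each agent scores its actions by averaging over a uniform policy of the
# other agent, action 0 looks worst, so independent greedy learners settle on
# the suboptimal 0-payoff region instead of the joint optimum.
agent1_values = payoff.mean(axis=1)   # [-16/3, -4, -4]
agent2_values = payoff.mean(axis=0)
greedy_pair = (int(np.argmax(agent1_values)), int(np.argmax(agent2_values)))
```

Here `greedy_pair` lands on a joint action with payoff 0 rather than 8, which is exactly the kind of non-monotonicity that a purely monotonic mixing network cannot represent.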