34 papers with code • 2 benchmarks • 3 datasets
Starcraft II is a RTS game; the task is to train an agent to play the game.
( Image credit: The StarCraft Multi-Agent Challenge )
At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.
Proximal Policy Optimization (PPO) is a popular on-policy reinforcement learning algorithm but is significantly less utilized than off-policy learning algorithms in multi-agent settings.
Reinforcement learning (RL) is an area of research that has blossomed tremendously in recent years and has shown remarkable potential for artificial intelligence based opponents in computer games.
We show that with orders of less computation scale, a faithful reimplementation of AlphaStar's methods can not succeed and the proposed techniques are necessary to ensure TStarBot-X's competitive performance.
This poses non-trivial difficulties for researchers or engineers and prevents the application of MARL to a broader range of real-world problems.