87 papers with code • 0 benchmarks • 6 datasets
StarCraft is a real-time strategy (RTS) game; the task is to train an agent to play it.
(Image credit: Macro Action Selection with Deep Reinforcement Learning in StarCraft)
These leaderboards are used to track progress in StarCraft.
We present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain.
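Those baselines were run through DeepMind's StarCraft II Learning Environment, exposed in Python via the pysc2 package. Below is a minimal sketch of stepping the built-in random agent through one of the SC2LE mini-games; the map name, screen/minimap resolutions, and step_mul are illustrative choices, not prescribed values, and a local StarCraft II installation is required.

```python
# Minimal sketch: a random agent in the StarCraft II Learning Environment.
# Assumes `pip install pysc2` and a local StarCraft II installation.
from absl import flags
from pysc2.agents import random_agent
from pysc2.env import sc2_env
from pysc2.lib import features

flags.FLAGS(["sc2_minimal"])  # pysc2 requires absl flags to be parsed

def run_episode():
    agent = random_agent.RandomAgent()
    with sc2_env.SC2Env(
        map_name="MoveToBeacon",      # one of the SC2LE mini-games
        players=[sc2_env.Agent(sc2_env.Race.terran)],
        agent_interface_format=features.AgentInterfaceFormat(
            feature_dimensions=features.Dimensions(screen=84, minimap=64)),
        step_mul=8,                   # game frames per agent decision
    ) as env:
        agent.setup(env.observation_spec()[0], env.action_spec()[0])
        timesteps = env.reset()
        agent.reset()
        while not timesteps[0].last():
            timesteps = env.step([agent.step(timesteps[0])])

if __name__ == "__main__":
    run_episode()
```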
At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
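This centralised-training, decentralised-execution (CTDE) setup is easy to state in code: the actor only ever conditions on a local observation, while a critic that consumes the global state exists purely at training time. A minimal PyTorch sketch, with all dimensions and layer sizes as illustrative assumptions:

```python
# Sketch of centralised training with decentralised execution (CTDE).
import torch
import torch.nn as nn

class DecentralisedActor(nn.Module):
    """Executes from local observations only, so it still works when the
    global state and inter-agent communication are unavailable."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, local_obs):
        return torch.distributions.Categorical(logits=self.net(local_obs))

class CentralisedCritic(nn.Module):
    """Used only during training, where the simulator or laboratory
    setting exposes the global state; discarded at execution time."""
    def __init__(self, state_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, global_state):
        return self.net(global_state).squeeze(-1)
```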
A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible.
Proximal Policy Optimization (PPO) is a popular on-policy reinforcement learning algorithm but is significantly less utilized than off-policy learning algorithms in multi-agent settings.
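For reference, the core of PPO is the clipped surrogate objective below, which bounds how far the updated policy can move from the data-collecting policy; multi-agent variants typically apply the same per-agent loss alongside a centralised value function. A PyTorch sketch, using the commonly chosen clip_eps = 0.2:

```python
# The PPO clipped surrogate objective (negated, so it can be minimised).
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    ratio = torch.exp(logp_new - logp_old)        # pi_new(a|s) / pi_old(a|s)
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    # Pessimistic bound: take the elementwise minimum of the unclipped
    # and clipped terms before averaging over the batch.
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```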
Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems.
Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.
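Concretely, Weighted QMIX replaces the uniform squared TD loss with a weighted one that pays less attention to joint actions whose mixed value already overestimates the target. The sketch below implements the optimistically-weighted (OW) variant as I read it; the tensor shapes and the default alpha are my assumptions, not the paper's settings.

```python
# Sketch of an optimistically-weighted (OW) TD loss in the style of
# Weighted QMIX; alpha is a hyperparameter in (0, 1].
import torch

def ow_weighted_td_loss(q_tot, targets, alpha=0.5):
    """q_tot: Q_tot(s, u) from the monotonic mixing network.
    targets: bootstrapped TD targets y."""
    td_error = q_tot - targets
    # Weight 1 where Q_tot underestimates the target (td_error < 0),
    # alpha where it overestimates, breaking the equal weighting.
    weights = torch.where(td_error < 0,
                          torch.ones_like(td_error),
                          torch.full_like(td_error, alpha))
    return (weights.detach() * td_error ** 2).mean()
```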
Both TStarBot1 and TStarBot2 are able to defeat the built-in AI agents from level 1 to level 10 in a full game (1v1 Zerg-vs-Zerg on the AbyssalReef map). Note that levels 8, 9, and 10 are cheating agents with unfair advantages such as full vision of the whole map and boosted resource harvesting.