Combinatorial Multi-armed Bandits for Real-Time Strategy Games

13 Oct 2017 · Santiago Ontañón

Games with large branching factors pose a significant challenge for game tree search algorithms. In this paper, we address this problem with a sampling strategy for Monte Carlo Tree Search (MCTS) algorithms called naïve sampling, based on a variant of the Multi-armed Bandit problem called Combinatorial Multi-armed Bandits (CMAB). We analyze the theoretical properties of several variants of naïve sampling, and empirically compare it against the other existing strategies in the literature for CMABs. We then evaluate these strategies in the context of real-time strategy (RTS) games, a genre of computer games characterized by their very large branching factors. Our results show that as the branching factor grows, naïve sampling outperforms the other sampling strategies.
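
The abstract does not spell out how naïve sampling works, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a CMAB whose arms are tuples of per-variable values, assumes epsilon-greedy policies throughout, and treats the reward as if each variable contributed independently (the "naïve" assumption). On an explore step, every variable is sampled by its own local bandit; on an exploit step, a global bandit picks among the value combinations tried so far. The class name, the parameter names (eps_explore, eps_local, eps_global), and the toy reward function are all hypothetical and chosen for illustration.

```python
import random


class NaiveSampler:
    """Illustrative epsilon-greedy sketch of naive sampling for a CMAB."""

    def __init__(self, value_counts, eps_explore=0.25, eps_local=0.25,
                 eps_global=0.0, seed=None):
        self.eps_explore = eps_explore  # chance of exploring via local bandits
        self.eps_local = eps_local      # chance a local bandit picks at random
        self.eps_global = eps_global    # chance the global bandit picks at random
        self.rng = random.Random(seed)
        # One local bandit per variable: local[i][v] = [pulls, total reward].
        self.local = [[[0, 0.0] for _ in range(k)] for k in value_counts]
        # Global bandit over combinations sampled so far: arm -> [pulls, total].
        self.global_stats = {}

    @staticmethod
    def _mean(stat):
        pulls, total = stat
        return total / pulls if pulls else 0.0

    def _pick_local(self, i):
        stats = self.local[i]
        if self.rng.random() < self.eps_local:
            return self.rng.randrange(len(stats))
        return max(range(len(stats)), key=lambda v: self._mean(stats[v]))

    def select(self):
        """Return the next combination (a tuple with one value per variable)."""
        if not self.global_stats or self.rng.random() < self.eps_explore:
            # Explore: each variable is chosen independently by its local bandit.
            return tuple(self._pick_local(i) for i in range(len(self.local)))
        arms = list(self.global_stats)
        if self.rng.random() < self.eps_global:
            return self.rng.choice(arms)
        # Exploit: best combination seen so far according to the global bandit.
        return max(arms, key=lambda a: self._mean(self.global_stats[a]))

    def update(self, arm, reward):
        """Credit the reward to every local bandit and to the global bandit."""
        for i, v in enumerate(arm):
            self.local[i][v][0] += 1
            self.local[i][v][1] += reward
        stat = self.global_stats.setdefault(arm, [0, 0.0])
        stat[0] += 1
        stat[1] += reward

    def best(self):
        """Combination with the highest mean reward in the global bandit."""
        return max(self.global_stats,
                   key=lambda a: self._mean(self.global_stats[a]))


def toy_reward(arm, target=(2, 1, 3)):
    # Hypothetical reward: number of variables set to their target value.
    return float(sum(a == t for a, t in zip(arm, target)))


def run_demo(iterations=2000, seed=0):
    # Three variables with four values each: 4**3 = 64 combinations.
    sampler = NaiveSampler([4, 4, 4], seed=seed)
    for _ in range(iterations):
        arm = sampler.select()
        sampler.update(arm, toy_reward(arm))
    return sampler.best()
```

Calling run_demo() returns the combination the sampler currently rates highest on the toy problem. The point of the design is that the local bandits only have to learn a number of values that grows with the sum of the per-variable choices, while the global bandit only tracks combinations that were actually sampled, rather than the full product space.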
