Go is an abstract strategy board game for two players, in which the aim is to surround more territory than the opponent. The task is to train an agent that plays the game at a level superior to other players.
A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources.
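For reference, NRPA nests levels of search: each level runs rollouts with a softmax policy and adapts that policy toward the best sequence found so far. Below is a minimal sketch on a toy sequence-building problem; the toy domain, the step size `ALPHA`, and the iteration count `N` are illustrative assumptions, not values from the paper.

```python
# Minimal NRPA sketch: learn a length-10 bit string matching a hidden
# target (a stand-in for a real search domain such as Morpion Solitaire).
import math
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical goal sequence
MOVES = [0, 1]
ALPHA = 1.0   # policy-adaptation step size (assumed)
N = 10        # iterations per nesting level (assumed)

def score(seq):
    """Toy reward: number of positions matching the hidden target."""
    return sum(int(a == b) for a, b in zip(seq, TARGET))

def rollout(policy):
    """Play one episode by softmax-sampling each move from the policy."""
    seq = []
    for step in range(len(TARGET)):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in MOVES]
        seq.append(random.choices(MOVES, weights=weights)[0])
    return score(seq), seq

def adapt(policy, seq):
    """Shift policy weights toward the best sequence found so far."""
    new = dict(policy)
    for step, move in enumerate(seq):
        z = sum(math.exp(policy.get((step, m), 0.0)) for m in MOVES)
        for m in MOVES:
            p = math.exp(policy.get((step, m), 0.0)) / z
            new[(step, m)] = new.get((step, m), 0.0) - ALPHA * p
        new[(step, move)] = new.get((step, move), 0.0) + ALPHA
    return new

def nrpa(level, policy):
    """Recursive nesting: each level adapts the policy on its best rollout."""
    if level == 0:
        return rollout(policy)
    best_score, best_seq = -1, None
    for _ in range(N):
        s, seq = nrpa(level - 1, policy)
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq)
    return best_score, best_seq

print(nrpa(3, {}))  # e.g. (10, [1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
```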
We derive model-free RL algorithms based on $\kappa$-PI and $\kappa$-VI in which the surrogate problem can be solved by any discrete- or continuous-action RL method, such as DQN and TRPO.
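Concretely, the $\kappa$-greedy surrogate can be formed by reshaping the reward to $r + \gamma(1-\kappa)\,\mathbb{E}[V(s')]$ and shrinking the discount to $\gamma\kappa$, after which any solver applies. The tabular sketch below uses plain value iteration as a stand-in for DQN or TRPO; the random MDP and all constants are illustrative assumptions.

```python
# Tabular sketch of the kappa-greedy surrogate behind kappa-PI / kappa-VI.
import numpy as np

rng = np.random.default_rng(0)
S, A, GAMMA, KAPPA = 5, 3, 0.9, 0.6

P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] -> dist over s'
R = rng.random((S, A))                      # reward table

def solve_surrogate(V, iters=200):
    """Solve the kappa surrogate MDP with value iteration (a stand-in
    for an arbitrary RL method such as DQN or TRPO)."""
    r_kappa = R + GAMMA * (1 - KAPPA) * (P @ V)   # reshaped reward
    W = np.zeros(S)
    for _ in range(iters):
        W = (r_kappa + GAMMA * KAPPA * (P @ W)).max(axis=1)
    greedy = (r_kappa + GAMMA * KAPPA * (P @ W)).argmax(axis=1)
    return W, greedy

# kappa-VI outer loop: feed the surrogate's value back in as the next V.
V = np.zeros(S)
for _ in range(50):
    V, pi = solve_surrogate(V)
print("kappa-greedy policy:", pi)
```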
We introduce three structures and training methods that aim to create a strong Go player: non-rectangular convolutions, which better capture the shapes on the board; supervised learning on a dataset of 53,000 professional games; and reinforcement learning on games played between different versions of the network. One way to realize the first component is sketched below.
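A common way to realize a non-rectangular convolution is to mask a standard square kernel down to a diamond so the receptive field follows Go-like diagonal shapes. The diamond mask below is an illustrative assumption; the paper's exact kernel shapes may differ.

```python
# PyTorch sketch: a square kernel masked to a diamond (plus) shape.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiamondConv2d(nn.Conv2d):
    """Conv layer whose weights are zeroed outside a diamond mask."""
    def __init__(self, in_ch, out_ch, size=3, **kw):
        super().__init__(in_ch, out_ch, size, padding=size // 2, **kw)
        c = size // 2
        ys, xs = torch.meshgrid(torch.arange(size), torch.arange(size),
                                indexing="ij")
        diamond = ((ys - c).abs() + (xs - c).abs() <= c).float()
        self.register_buffer("mask", diamond)  # shape (size, size)

    def forward(self, x):
        # Apply the mask to the weights before every convolution.
        return F.conv2d(x, self.weight * self.mask, self.bias,
                        stride=self.stride, padding=self.padding)

board = torch.randn(1, 1, 19, 19)        # one-plane 19x19 Go position
print(DiamondConv2d(1, 8)(board).shape)  # torch.Size([1, 8, 19, 19])
```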
We propose MoET, a more expressive yet still interpretable model based on Mixture of Experts, consisting of a gating function that partitions the state space and multiple decision tree experts that specialize in different partitions.
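A minimal sketch of the idea follows, assuming a linear softmax gate bootstrapped from k-means pseudo-labels and hard routing of each state to one decision tree expert; the paper's actual training procedure (e.g., joint optimization of gate and experts) may differ.

```python
# MoET-style sketch: softmax gate partitions states; trees specialize.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 4))                  # toy states
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # toy teacher actions to imitate

K = 2  # number of experts (assumed)
clusters = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(X)

gate = LogisticRegression(max_iter=1000).fit(X, clusters)  # softmax gate
experts = [DecisionTreeClassifier(max_depth=3).fit(X[clusters == k],
                                                   y[clusters == k])
           for k in range(K)]

def moet_predict(x):
    """Route each state through the gate, then its expert tree."""
    ks = gate.predict(x)
    return np.array([experts[k].predict(xi[None])[0]
                     for k, xi in zip(ks, x)])

print((moet_predict(X) == y).mean())  # imitation accuracy on toy data
```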
The evaluation function for imperfect information games is notoriously hard to define, yet it has a significant impact on the playing strength of a program.