1 code implementation • 6 Jan 2024 • Ava Pettet, Yunuo Zhang, Baiting Luo, Kyle Wray, Hendrik Baier, Aron Laszka, Abhishek Dubey, Ayan Mukhopadhyay
In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment.
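One simple way to picture such a combination is a weighted blend of the two value sources at the root of the search. The sketch below is illustrative only and assumes a convex combination controlled by a hypothetical weight `alpha`; the function names, the dictionary representation of action values, and the weighting rule are assumptions made for this example and are not taken from the abstract.

```python
from typing import Dict


def combine_action_values(
    q_policy: Dict[str, float],
    q_search: Dict[str, float],
    alpha: float,
) -> Dict[str, float]:
    """Blend stale policy estimates with fresh search estimates.

    alpha = 1.0 trusts only the out-of-date policy; alpha = 0.0 trusts
    only the online search. (Assumed rule, for illustration.)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Only actions scored by both sources can be blended.
    shared = sorted(q_policy.keys() & q_search.keys())
    return {a: alpha * q_policy[a] + (1.0 - alpha) * q_search[a] for a in shared}


def select_action(
    q_policy: Dict[str, float],
    q_search: Dict[str, float],
    alpha: float,
) -> str:
    """Pick the action with the highest blended value."""
    combined = combine_action_values(q_policy, q_search, alpha)
    if not combined:
        raise ValueError("no action is scored by both sources")
    return max(combined, key=combined.get)


if __name__ == "__main__":
    # Hypothetical values: the old policy prefers "left", the search prefers "right".
    old_policy = {"left": 1.0, "right": 0.0}
    fresh_search = {"left": 0.0, "right": 2.0}
    for weight in (0.0, 0.5, 0.8, 1.0):
        print(weight, select_action(old_policy, fresh_search, weight))
```

With a small `alpha` the search estimates dominate, which would suit an environment that has changed a lot since the policy was trained; a large `alpha` leans on the older policy instead.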
no code implementations • 31 May 2022 • Daniel Hernandez, Hendrik Baier, Michael Kaisers
Finding a best response policy is a central objective in game theory and multi-agent learning, with modern population-based training approaches employing reinforcement learning algorithms as best-response oracles to improve play against candidate opponents (typically previously learnt policies).
1 code implementation • 27 Jan 2022 • Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek
To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation, based on a statistic that measures the accuracy of the approximate simulator.
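As a rough illustration of per-simulation simulator selection, the sketch below tracks a running accuracy estimate for the approximate simulator and falls back to the exact one until that estimate is high enough. The threshold rule, the `AccuracyTracker` class, and the parameters `threshold` and `min_samples` are assumptions made for this example; the abstract does not specify the statistic or the decision rule.

```python
class AccuracyTracker:
    """Running mean of how often the approximate simulator matched the exact one."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, matched: bool) -> None:
        # Incremental mean update; avoids storing the full history.
        self.count += 1
        self.mean += (float(matched) - self.mean) / self.count


def choose_simulator(
    tracker: AccuracyTracker,
    threshold: float = 0.9,
    min_samples: int = 10,
) -> str:
    """Return which simulator to use for the next simulation.

    Assumed rule: use the approximate simulator only once enough
    comparisons exist and its measured accuracy reaches the threshold.
    """
    if tracker.count >= min_samples and tracker.mean >= threshold:
        return "approximate"
    return "exact"


if __name__ == "__main__":
    tracker = AccuracyTracker()
    print(choose_simulator(tracker))  # no evidence yet, so the exact simulator is used
    for _ in range(10):
        tracker.update(True)
    print(choose_simulator(tracker))
```

A hard threshold is the simplest possible choice here; a bandit-style rule that trades off simulation cost against estimated accuracy would be another option.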
1 code implementation • 3 Aug 2018 • Timothy Atkinson, Hendrik Baier, Tara Copplestone, Sam Devlin, Jerry Swan
In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games.