Search Results for author: Neil Burch

Found 16 papers, 7 papers with code

Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning

1 code implementation • 2 Mar 2023 • Marc Lanctot, John Schultz, Neil Burch, Max Olan Smith, Daniel Hennes, Thomas Anthony, Julien Perolat

Progress in fields of machine learning and adversarial planning has benefited significantly from benchmark domains, from checkers and the classic UCI data sets to Go and Diplomacy.

Decision Making, Language Modelling

Solving Common-Payoff Games with Approximate Policy Iteration

2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Human-Agent Cooperation in Bridge Bidding

no code implementations • 28 Nov 2020 • Edward Lockhart, Neil Burch, Nolan Bard, Sebastian Borgeaud, Tom Eccles, Lucas Smaira, Ray Smith

We introduce a human-compatible reinforcement-learning approach to a cooperative game, making use of a third-party hand-coded human-compatible bot to generate initial training data and to perform initial evaluation.

Imitation Learning, reinforcement-learning, +1

Rethinking Formal Models of Partially Observable Multiagent Decision Making

no code implementations • 26 Jun 2019 • Vojtěch Kovařík, Martin Schmid, Neil Burch, Michael Bowling, Viliam Lisý

A second issue is that while EFGs have recently seen significant algorithmic progress, their classical formalization is unsuitable for efficient presentation of the underlying ideas, such as those around decomposition.

counterfactual, Decision Making, +1

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

1 code implementation • 4 Nov 2018 • Jakob N. Foerster, Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew Botvinick, Michael Bowling

We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment.

Multi-agent Reinforcement Learning, Policy Gradient Methods, +2

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

no code implementations • 9 Sep 2018 • Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling

The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates.

counterfactual
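
The baseline idea in the snippet above can be illustrated with a generic control-variate estimator. This is a minimal sketch, not the paper's algorithm: the function name `baseline_corrected_values`, the dictionary inputs, and the single-step setting are assumptions made for illustration. Every action starts at its baseline value, and only the sampled action receives an importance-weighted correction, which keeps the estimate unbiased whatever the baseline is.

```python
def baseline_corrected_values(baselines, sampled_action, sampled_value, sample_probs):
    """Return a value estimate for every action after sampling one of them.

    baselines:      dict action -> baseline guess of that action's value (assumed input)
    sampled_action: the action that was actually sampled
    sampled_value:  the value observed (or bootstrapped) for the sampled action
    sample_probs:   dict action -> probability with which it is sampled (must be > 0)
    """
    # Unsampled actions keep their baseline value.
    estimates = dict(baselines)
    # The sampled action gets baseline + importance-weighted error term.
    b = baselines[sampled_action]
    q = sample_probs[sampled_action]
    estimates[sampled_action] = b + (sampled_value - b) / q
    return estimates


def expected_estimates(true_values, baselines, sample_probs):
    """Average the estimator over every possible sampled action (exact expectation)."""
    expected = {a: 0.0 for a in true_values}
    for a, q in sample_probs.items():
        est = baseline_corrected_values(baselines, a, true_values[a], sample_probs)
        for action, value in est.items():
            expected[action] += q * value
    return expected
```

Averaging over all possible samples with `expected_estimates` recovers the true values for any choice of baselines; a baseline closer to the true value shrinks the correction term and therefore the variance.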

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

1 code implementation • 6 Jan 2017 • Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling

Poker is the quintessential game of imperfect information, and a longstanding challenge problem in artificial intelligence.

Game of Poker

AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games

no code implementations • 20 Dec 2016 • Neil Burch, Martin Schmid, Matej Moravčík, Michael Bowling

Evaluating agent performance when outcomes are stochastic and agents use randomized strategies can be challenging when there is limited data available.

Predicting the Performance of IDA* using Conditional Distributions

no code implementations • 15 Jan 2014 • Uzi Zahavi, Ariel Felner, Neil Burch, Robert C. Holte

In this paper we show that, in addition to requiring the heuristic to be consistent, their formula's predictions are accurate only at levels of the brute-force search tree where the heuristic values obey the unconditional distribution that they defined and then used in their formula.

Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions

no code implementations • NeurIPS 2012 • Neil Burch, Marc Lanctot, Duane Szafron, Richard G. Gibson

In this paper, we present a new MCCFR algorithm, Average Strategy Sampling (AS), that samples a subset of the player's actions according to the player's average strategy.

counterfactual
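
Sampling a subset of actions according to an average strategy might look like the sketch below. The listing does not give the paper's exact sampling rule, so the inclusion probability used here, together with the function name `sample_actions` and the parameters `epsilon`, `beta`, and `tau`, is an assumption for illustration: each action is included independently, with a probability that grows with its accumulated average-strategy weight and never drops below the exploration floor `epsilon`.

```python
import random


def sample_actions(cumulative_strategy, epsilon=0.05, beta=1.0, tau=1.0, rng=None):
    """Pick a subset of actions, favouring those with high average-strategy weight.

    cumulative_strategy: dict action -> accumulated (unnormalised) average-strategy weight
    epsilon: minimum inclusion probability for any action (assumed parameter)
    beta:    smoothing constant, must be > 0 so the denominator is never zero (assumed)
    tau:     scales how strongly the weights drive inclusion (assumed)
    """
    rng = rng or random.Random()
    total = sum(cumulative_strategy.values())
    chosen = []
    for action, weight in cumulative_strategy.items():
        # Assumed inclusion rule: proportional to the action's share of the
        # cumulative strategy, clipped to the range [epsilon, 1].
        prob = max(epsilon, min(1.0, (beta + tau * weight) / (beta + total)))
        if rng.random() < prob:
            chosen.append(action)
    return chosen
```

A caller would recurse only into the returned actions and weight their values by the reciprocal of the inclusion probability to keep the regret estimates unbiased; that correction is omitted here for brevity.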

Bayes' Bluff: Opponent Modelling in Poker

1 code implementation • 4 Jul 2012 • Finnegan Southey, Michael P. Bowling, Bryce Larson, Carmelo Piccione, Neil Burch, Darse Billings, Chris Rayner

We demonstrate methods for playing effective responses to the opponent, based on the posterior.
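
As a rough illustration of a posterior over opponents, the sketch below applies Bayes' rule to a small set of candidate opponent types. It is not the paper's model: the discrete candidate set, the function name `opponent_posterior`, and the per-action likelihood tables are all assumptions made for this example.

```python
def opponent_posterior(prior, likelihoods, observations):
    """Posterior over candidate opponent types after observing their actions.

    prior:        dict opponent_type -> prior probability
    likelihoods:  dict opponent_type -> dict action -> probability of that action
    observations: list of observed actions, treated as independent given the type
    """
    weights = {}
    for name, p in prior.items():
        w = p
        for action in observations:
            # An action the type never takes gets likelihood zero.
            w *= likelihoods[name].get(action, 0.0)
        weights[name] = w
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("observations are impossible under every candidate type")
    # Normalise so the posterior sums to one.
    return {name: w / total for name, w in weights.items()}
```

A response can then be chosen against the posterior, for example by maximising expected value under the weighted mixture of candidate types.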
