Search Results for author: Edward Lockhart

Found 14 papers, 7 papers with code

Fast computation of Nash Equilibria in Imperfect Information Games

no code implementations • ICML 2020 • Remi Munos, Julien Perolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls

We introduce and analyze a class of algorithms, called Mirror Ascent against an Improved Opponent (MAIO), for computing Nash equilibria in two-player zero-sum games, both in normal form and in sequential imperfect information form.
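The MAIO update itself is not given in this listing, but the underlying idea, mirror ascent against a best-responding (improved) opponent, can be sketched on a toy normal-form game. Everything below (the rock-paper-scissors matrix, the entropic step, step size, and helper names) is illustrative and not the paper's algorithm or notation.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors; the game is zero-sum with value 0.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def best_response_value(x, A):
    """Payoff the column player can hold the row player to against strategy x."""
    return (x @ A).min()  # column player picks the worst column for the row player

def mirror_ascent(A, steps=5000, lr=0.05):
    """Entropic mirror ascent for the row player against a best-responding opponent."""
    x = np.ones(A.shape[0]) / A.shape[0]
    avg = np.zeros_like(x)
    for _ in range(steps):
        j = np.argmin(x @ A)        # opponent's best response to the current strategy
        grad = A[:, j]              # row player's payoff gradient against that response
        x = x * np.exp(lr * grad)   # entropic mirror (multiplicative-weights) step
        x /= x.sum()
        avg += x
    return avg / steps              # averaged iterate approximates an equilibrium

x = mirror_ascent(A)
exploitability = -best_response_value(x, A)  # distance from the game value of 0
```

For rock-paper-scissors the averaged strategy approaches uniform and its exploitability shrinks toward zero as the step size decreases and the horizon grows.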

Solving Common-Payoff Games with Approximate Policy Iteration

2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.

Tasks: Decoder, Multi-agent Reinforcement Learning (+3 more)

Human-Agent Cooperation in Bridge Bidding

no code implementations • 28 Nov 2020 • Edward Lockhart, Neil Burch, Nolan Bard, Sebastian Borgeaud, Tom Eccles, Lucas Smaira, Ray Smith

We introduce a human-compatible reinforcement-learning approach to a cooperative game, making use of a third-party hand-coded human-compatible bot to generate initial training data and to perform initial evaluation.

Tasks: Imitation Learning, Reinforcement Learning (+1 more)

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

18 code implementations • 19 Nov 2019 • Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

Tasks: Atari Games, Atari Games 100k (+3 more)

Consistent Jumpy Predictions for Videos and Scenes

no code implementations • ICLR 2019 • Ananya Kumar, S. M. Ali Eslami, Danilo Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive.

Tasks: 3D Scene Reconstruction, Video Prediction

Deep reinforcement learning with relational inductive biases

no code implementations • ICLR 2019 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for augmenting model-free deep reinforcement learning agents with a mechanism for relational reasoning over structured representations, which improves performance, learning efficiency, generalization, and interpretability.
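The relational-reasoning mechanism this abstract refers to is built on attention over a set of entity vectors. A minimal sketch of that core operation follows; the shapes, weight names, and random inputs are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def self_attention(entities, Wq, Wk, Wv):
    """Dot-product self-attention over a set of entity vectors.

    entities: (n, d) array, e.g. feature vectors extracted from a CNN grid.
    Returns an (n, d) array where each entity has aggregated information
    from every other entity, weighted by pairwise compatibility.
    """
    Q, K, V = entities @ Wq, entities @ Wk, entities @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])          # pairwise entity interactions
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)               # softmax over entities per row
    return w @ V                                    # relational features per entity

rng = np.random.default_rng(0)
n, d = 5, 8
E = rng.normal(size=(n, d))                         # five hypothetical entities
out = self_attention(E,
                     rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)),
                     rng.normal(size=(d, d)))
```

Because the attention weights are computed over all entity pairs, the output for each entity depends on its relations to the whole set rather than on a fixed spatial neighborhood.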

Tasks: Reinforcement Learning (+4 more)

Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

no code implementations • 13 Mar 2019 • Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls

In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents.
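For intuition, the exploitability-descent idea can be sketched in a toy normal-form game: each player takes a projected gradient step on its value against the opponent's current best response, which is exactly subgradient descent on its own exploitability. The paper targets extensive-form games; the matrix game, step size, and helper functions below are illustrative assumptions only.

```python
import numpy as np

# Row player's payoff in rock-paper-scissors (zero-sum, value 0).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def nash_conv(x, y, A):
    """Total incentive to deviate; 0 exactly at a Nash equilibrium."""
    return (A @ y).max() - (x @ A).min()

x = np.ones(3) / 3
y = np.ones(3) / 3
avg_x, avg_y = np.zeros(3), np.zeros(3)
lr, T = 0.05, 2000
for _ in range(T):
    br_col = np.argmin(x @ A)                   # column best response to x
    br_row = np.argmax(A @ y)                   # row best response to y
    x = project_simplex(x + lr * A[:, br_col])  # ascend row value vs. best response
    y = project_simplex(y - lr * A[br_row, :])  # ascend column value vs. best response
    avg_x += x
    avg_y += y
avg_x /= T
avg_y /= T
```

Each player's exploitability is a convex function of its own strategy, so these steps are standard projected subgradient descent and the averaged iterates approach an approximate equilibrium.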

Tasks: counterfactual

Consistent Generative Query Networks

no code implementations • ICLR 2019 • Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan

These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive.

Tasks: 3D Scene Reconstruction, Video Prediction

Relational Deep Reinforcement Learning

7 code implementations • 5 Jun 2018 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.

Tasks: Reinforcement Learning (+4 more)
