Search Results for author: Thore Graepel

Found 40 papers, 19 papers with code

Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria

no code implementations5 Jan 2022 Kavya Kopparapu, Edgar A. Duéñez-Guzmán, Jayd Matyas, Alexander Sasha Vezhnevets, John P. Agapiou, Kevin R. McKee, Richard Everett, Janusz Marecki, Joel Z. Leibo, Thore Graepel

A key challenge in the study of multiagent cooperation is the need for individual agents not only to cooperate effectively, but to decide with whom to cooperate.

reinforcement-learning

A PAC-Bayesian Analysis of Distance-Based Classifiers: Why Nearest-Neighbour works!

no code implementations28 Sep 2021 Thore Graepel, Ralf Herbrich

Abstract We present PAC-Bayesian bounds for the generalisation error of the K-nearest-neighbour classifier (K-NN).

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

no code implementations14 Jul 2021 Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).

Multi-agent Reinforcement Learning reinforcement-learning

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

1 code implementation17 Jun 2021 Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting.

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation25 May 2021 SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Decision Making Imitation Learning +2

EigenGame Unloaded: When playing games is better than optimizing

1 code implementation ICLR 2022 Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel

We build on the recently proposed EigenGame that views eigendecomposition as a competitive game.

Open Problems in Cooperative AI

no code implementations15 Dec 2020 Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, Thore Graepel

We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.

Natural Language Processing

EigenGame: PCA as a Nash Equilibrium

2 code implementations ICLR 2021 Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel

We present a novel view on principal component analysis (PCA) as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximize their own utility function.

Biases for Emergent Communication in Multi-agent Reinforcement Learning

no code implementations NeurIPS 2019 Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel

We study the problem of emergent communication, in which language arises because speakers and listeners must communicate information in order to solve tasks.

Multi-agent Reinforcement Learning reinforcement-learning

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

15 code implementations19 Nov 2019 Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent SIfre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

Atari Games Game of Chess +2

Multiagent Reinforcement Learning in Games with an Iterated Dominance Solution

no code implementations25 Sep 2019 Yoram Bachrach, Tor Lattimore, Marta Garnelo, Julien Perolat, David Balduzzi, Thomas Anthony, Satinder Singh, Thore Graepel

We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.

reinforcement-learning

A Neural Architecture for Designing Truthful and Efficient Auctions

no code implementations11 Jul 2019 Andrea Tacchetti, DJ Strouse, Marta Garnelo, Thore Graepel, Yoram Bachrach

Auctions are protocols to allocate goods to buyers who have preferences over them, and collect payments in return.

Differentiable Game Mechanics

1 code implementation13 May 2019 Alistair Letcher, David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games.

Emergent Coordination Through Competition

no code implementations ICLR 2019 Si-Qi Liu, Guy Lever, Josh Merel, Saran Tunyasuvunakool, Nicolas Heess, Thore Graepel

We study the emergence of cooperative behaviors in reinforcement learning agents by introducing a challenging competitive multi-agent soccer environment with continuous simulated physics.

Continuous Control reinforcement-learning

Open-ended Learning in Symmetric Zero-sum Games

no code implementations23 Jan 2019 David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them `winner' and `loser'.

Malthusian Reinforcement Learning

no code implementations17 Dec 2018 Joel Z. Leibo, Julien Perolat, Edward Hughes, Steven Wheelwright, Adam H. Marblestone, Edgar Duéñez-Guzmán, Peter Sunehag, Iain Dunning, Thore Graepel

Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation.

Multi-agent Reinforcement Learning reinforcement-learning

Relational Forward Models for Multi-Agent Learning

no code implementations ICLR 2019 Andrea Tacchetti, H. Francis Song, Pedro A. M. Mediano, Vinicius Zambaldi, Neil C. Rabinowitz, Thore Graepel, Matthew Botvinick, Peter W. Battaglia

The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them.

Adaptive Mechanism Design: Learning to Promote Cooperation

4 code implementations11 Jun 2018 Tobias Baumann, Thore Graepel, John Shawe-Taylor

In the future, artificial learning agents are likely to become increasingly widespread in our society.

Re-evaluating Evaluation

1 code implementation NeurIPS 2018 David Balduzzi, Karl Tuyls, Julien Perolat, Thore Graepel

Progress in machine learning is measured by careful evaluation on problems of outstanding common interest.

The Mechanics of n-Player Differentiable Games

1 code implementation ICML 2018 David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.

TrueSkill Through Time: Revisiting the History of Chess

2 code implementations NIPS 2007 Pierre Dangauthier, Ralf Herbrich, Tom Minka, Thore Graepel

We extend the Bayesian skill rating system TrueSkill to infer entire time series of skills of players by smoothing through time instead of filtering.

Time Series

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

1 code implementation NeurIPS 2017 Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL).

reinforcement-learning

Multi-agent Reinforcement Learning in Sequential Social Dilemmas

3 code implementations10 Feb 2017 Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel

We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions.

Multi-agent Reinforcement Learning reinforcement-learning

Learning Shared Representations in Multi-task Reinforcement Learning

no code implementations7 Mar 2016 Diana Borsa, Thore Graepel, John Shawe-Taylor

We investigate a paradigm in multi-task reinforcement learning (MT-RL) in which an agent is placed in an environment and needs to learn to perform a series of tasks, within this space.

reinforcement-learning

The Wreath Process: A totally generative model of geometric shape based on nested symmetries

no code implementations9 Jun 2015 Diana Borsa, Thore Graepel, Andrew Gordon

We consider the problem of modelling noisy but highly symmetric shapes that can be viewed as hierarchies of whole-part relationships in which higher level objects are composed of transformed collections of lower level objects.

SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

1 code implementation19 Jul 2012 Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information.

Cannot find the paper you are looking for? You can Submit a new open access paper.