Search Results for author: Karl Tuyls

Found 50 papers, 25 papers with code

Fast computation of Nash Equilibria in Imperfect Information Games

no code implementations • ICML 2020 • Remi Munos, Julien Perolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls

We introduce and analyze a class of algorithms, called Mirror Ascent against an Improved Opponent (MAIO), for computing Nash equilibria in two-player zero-sum games, both in normal form and in sequential imperfect information form.

Paper
Add Code

Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem

no code implementations • 23 Apr 2024 • Raphael Koster, Miruna Pîslar, Andrea Tacchetti, Jan Balaguer, Leqi Liu, Romuald Elie, Oliver P. Hauser, Karl Tuyls, Matt Botvinick, Christopher Summerfield

A canonical social dilemma arises when finite resources are allocated to a group of people, who can choose to either reciprocate with interest, or keep the proceeds for themselves.

Paper
Add Code

States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

1 code implementation • 24 Jan 2024 • Ian Gemp, Yoram Bachrach, Marc Lanctot, Roma Patel, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, SiQi Liu, Karl Tuyls

A suitable model of the players, strategies, and payoffs associated with linguistic interactions (i. e., a binding to the conventional symbolic logic of game theory) would enable existing game-theoretic algorithms to provide strategic solutions in the space of language.

Imitation Learning

3,998

Paper
Code

Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

no code implementations • 23 Oct 2023 • Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective.

Multi-agent Reinforcement Learning Multi-Armed Bandits +1

Paper
Add Code

TacticAI: an AI assistant for football tactics

no code implementations • 16 Oct 2023 • Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, William Spearman, Ian Graham, Jerome Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess, Michael Bowling, Demis Hassabis, Karl Tuyls

The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC.

Retrieval

Paper
Add Code

Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas

no code implementations • 1 May 2023 • Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán

In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others.

Zero-shot Generalization

Paper
Add Code

Human-Timescale Adaptation in an Open-Ended Task Space

no code implementations • 18 Jan 2023 • Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang

Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL).

In-Context Learning Meta Reinforcement Learning +3

Paper
Add Code

An Analysis of Quantile Temporal-Difference Learning

no code implementations • 11 Jan 2023 • Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning.

Distributional Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers

no code implementations • 17 Oct 2022 • Luke Marris, Ian Gemp, Thomas Anthony, Andrea Tacchetti, SiQi Liu, Karl Tuyls

We argue that such a network is a powerful component for many possible multiagent algorithms.

Zero-shot Generalization

Paper
Add Code

Game Theoretic Rating in N-player general-sum games with Equilibria

no code implementations • 5 Oct 2022 • Luke Marris, Marc Lanctot, Ian Gemp, Shayegan Omidshafiei, Stephen Mcaleer, Jerome Connor, Karl Tuyls, Thore Graepel

Rating strategies in a game is an important area of research in game theory and artificial intelligence, and can be applied to any real-world competitive or cooperative setting.

Paper
Add Code

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

no code implementations • 22 Sep 2022 • Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, SiQi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls

The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d humanoids in difficult team coordination tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Learning Correlated Equilibria in Mean-Field Games

no code implementations • 22 Aug 2022 • Paul Muller, Romuald Elie, Mark Rowland, Mathieu Lauriere, Julien Perolat, Sarah Perrin, Matthieu Geist, Georgios Piliouras, Olivier Pietquin, Karl Tuyls

The designs of many large-scale systems today, from traffic routing environments to smart grids, rely on game-theoretic equilibrium concepts.

Paper
Add Code

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

1 code implementation • 30 Jun 2022 • Julien Perolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen Mcaleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent SIfre, Nathalie Beauguerlange, Remi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls

It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes).

Board Games Decision Making +2

3,998

Paper
Code

Statistical discrimination in learning agents

no code implementations • 21 Oct 2021 • Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo

Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics.

Decision Making Multi-agent Reinforcement Learning

Paper
Add Code

Evolutionary Dynamics and $Φ$-Regret Minimization in Games

no code implementations • 28 Jun 2021 • Georgios Piliouras, Mark Rowland, Shayegan Omidshafiei, Romuald Elie, Daniel Hennes, Jerome Connor, Karl Tuyls

Importantly, $\Phi$-regret enables learning agents to consider deviations from and to mixed strategies, generalizing several existing notions of regret such as external, internal, and swap regret, and thus broadening the insights gained from regret-based analysis of learning algorithms.

Paper
Add Code

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

1 code implementation • 17 Jun 2021 • Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel

Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting.

3,998

Paper
Code

Time-series Imputation of Temporally-occluded Multiagent Trajectories

no code implementations • 8 Jun 2021 • Shayegan Omidshafiei, Daniel Hennes, Marta Garnelo, Eugene Tarassov, Zhe Wang, Romuald Elie, Jerome T. Connor, Paul Muller, Ian Graham, William Spearman, Karl Tuyls

In multiagent environments, several decision-making individuals interact while adhering to the dynamics constraints imposed by the environment.

Collision Avoidance Decision Making +4

Paper
Add Code

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation • 25 May 2021 • SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Imitation Learning Multi-agent Reinforcement Learning +1

3,540

Paper
Code

Scaling up Mean Field Games with Online Mirror Descent

1 code implementation • 28 Feb 2021 • Julien Perolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin

We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD).

3,998

Paper
Code

Game Plan: What AI can do for Football, and What Football can do for AI

1 code implementation • 18 Nov 2020 • Karl Tuyls, Shayegan Omidshafiei, Paul Muller, Zhe Wang, Jerome Connor, Daniel Hennes, Ian Graham, William Spearman, Tim Waskett, Dafydd Steele, Pauline Luc, Adria Recasens, Alexandre Galashov, Gregory Thornton, Romuald Elie, Pablo Sprechmann, Pol Moreno, Kris Cao, Marta Garnelo, Praneet Dutta, Michal Valko, Nicolas Heess, Alex Bridgland, Julien Perolat, Bart De Vylder, Ali Eslami, Mark Rowland, Andrew Jaegle, Remi Munos, Trevor Back, Razia Ahamed, Simon Bouton, Nathalie Beauguerlange, Jackson Broshear, Thore Graepel, Demis Hassabis

The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis.

BIG-bench Machine Learning counterfactual +1

Paper
Code

The Advantage Regret-Matching Actor-Critic

no code implementations • 27 Aug 2020 • Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls

In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior.

counterfactual Reinforcement Learning (RL)

Paper
Add Code

Navigating the Landscape of Multiplayer Games

no code implementations • 4 May 2020 • Shayegan Omidshafiei, Karl Tuyls, Wojciech M. Czarnecki, Francisco C. Santos, Mark Rowland, Jerome Connor, Daniel Hennes, Paul Muller, Julien Perolat, Bart De Vylder, Audrunas Gruslys, Remi Munos

Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence.

Paper
Add Code

Real World Games Look Like Spinning Tops

1 code implementation • NeurIPS 2020 • Wojciech Marian Czarnecki, Gauthier Gidel, Brendan Tracey, Karl Tuyls, Shayegan Omidshafiei, David Balduzzi, Max Jaderberg

This paper investigates the geometrical properties of real world games (e. g. Tic-Tac-Toe, Go, StarCraft II).

Clustering Starcraft +1

Paper
Code

The Automated Inspection of Opaque Liquid Vaccines

no code implementations • 21 Feb 2020 • Gregory Palmer, Benjamin Schnieders, Rahul Savani, Karl Tuyls, Joscha-David Fossel, Harry Flore

We train 3D-ConvNets to predict the likelihood of 20-frame video samples containing anomalies.

Paper
Add Code

From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

no code implementations • 19 Feb 2020 • Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls

In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG).

Paper
Add Code

A Generalized Training Approach for Multiagent Learning

1 code implementation • ICLR 2020 • Paul Muller, Shayegan Omidshafiei, Mark Rowland, Karl Tuyls, Julien Perolat, Si-Qi Liu, Daniel Hennes, Luke Marris, Marc Lanctot, Edward Hughes, Zhe Wang, Guy Lever, Nicolas Heess, Thore Graepel, Remi Munos

This paper investigates a population-based training regime based on game-theoretic principles called Policy-Spaced Response Oracles (PSRO).

3,998

Paper
Code

Multiagent Evaluation under Incomplete Information

1 code implementation • NeurIPS 2019 • Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

This paper investigates the evaluation of learned multiagent strategies in the incomplete information setting, which plays a critical role in ranking and training of agents.

Paper
Code

OpenSpiel: A Framework for Reinforcement Learning in Games

15 code implementations • 26 Aug 2019 • Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

General Reinforcement Learning reinforcement-learning +1

3,998

Paper
Code

Neural Replicator Dynamics

1 code implementation • 1 Jun 2019 • Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Remi Munos, Julien Perolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar Duenez-Guzman, Karl Tuyls

Policy gradient and actor-critic algorithms form the basis of many commonly used training techniques in deep reinforcement learning.

counterfactual Policy Gradient Methods

Paper
Code

Differentiable Game Mechanics

1 code implementation • 13 May 2019 • Alistair Letcher, David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games.

149

Paper
Code

Deep reinforcement learning with relational inductive biases

no code implementations • ICLR 2019 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for augmenting model-free deep reinforcement learning agents with a mechanism for relational reasoning over structured representations, which improves performance, learning efficiency, generalization, and interpretability.

reinforcement-learning Reinforcement Learning (RL) +3

Paper
Add Code

Evolving Indoor Navigational Strategies Using Gated Recurrent Units In NEAT

no code implementations • 12 Apr 2019 • James Butterworth, Rahul Savani, Karl Tuyls

Simultaneous Localisation and Mapping (SLAM) algorithms are expensive to run on smaller robotic platforms such as Micro-Aerial Vehicles.

Paper
Add Code

Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

no code implementations • 13 Mar 2019 • Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls

In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents.

counterfactual

Paper
Add Code

α-Rank: Multi-Agent Evaluation by Evolution

1 code implementation • 4 Mar 2019 • Shayegan Omidshafiei, Christos Papadimitriou, Georgios Piliouras, Karl Tuyls, Mark Rowland, Jean-Baptiste Lespiau, Wojciech M. Czarnecki, Marc Lanctot, Julien Perolat, Remi Munos

We introduce {\alpha}-Rank, a principled evolutionary dynamics methodology, for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs).

Mathematical Proofs

3,998

Paper
Code

Robust Temporal Difference Learning for Critical Domains

no code implementations • 23 Jan 2019 • Richard Klima, Daan Bloembergen, Michael Kaisers, Karl Tuyls

We prove convergence of the operator to the optimal robust Q-function with respect to the model using the theory of Generalized Markov Decision Processes.

Paper
Add Code

Actor-Critic Policy Optimization in Partially Observable Multiagent Environments

1 code implementation • NeurIPS 2018 • Sriram Srinivasan, Marc Lanctot, Vinicius Zambaldi, Julien Perolat, Karl Tuyls, Remi Munos, Michael Bowling

Optimization of parameterized policies for reinforcement learning (RL) is an important and challenging problem in artificial intelligence.

reinforcement-learning Reinforcement Learning (RL)

3,998

Paper
Code

SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions

no code implementations • 18 Sep 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen

Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively few investigations.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Negative Update Intervals in Deep Multi-Agent Reinforcement Learning

1 code implementation • 13 Sep 2018 • Gregory Palmer, Rahul Savani, Karl Tuyls

For instance, hysteretic Q-learning addresses miscoordination while leaving agents vulnerable towards misleading stochastic rewards.

Multi-agent Reinforcement Learning Q-Learning +2

Paper
Code

Fast Convergence for Object Detection by Learning how to Combine Error Functions

no code implementations • 13 Aug 2018 • Benjamin Schnieders, Karl Tuyls

Compared to state-of-the-art task weighting methods, the improvement is 24. 5% in convergence, and 15. 8% on the estimated pickup rate.

object-detection Object Detection

Paper
Add Code

Re-evaluating Evaluation

2 code implementations • NeurIPS 2018 • David Balduzzi, Karl Tuyls, Julien Perolat, Thore Graepel

Progress in machine learning is measured by careful evaluation on problems of outstanding common interest.

3,998

Paper
Code

Relational Deep Reinforcement Learning

7 code implementations • 5 Jun 2018 • Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.

reinforcement-learning Reinforcement Learning (RL) +3

555

Paper
Code

Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input

1 code implementation • ICLR 2018 • Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark

The ability of algorithms to evolve or learn (compositional) communication protocols has traditionally been studied in the language evolution literature through the use of emergent communication tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Emergent Communication through Negotiation

1 code implementation • ICLR 2018 • Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark

We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.

Multi-agent Reinforcement Learning

Paper
Code

Inequity aversion improves cooperation in intertemporal social dilemmas

3 code implementations • NeurIPS 2018 • Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel

Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas.

Multi-agent Reinforcement Learning

373

Paper
Code

SA-IGA: A Multiagent Reinforcement Learning Method Towards Socially Optimal Outcomes

no code implementations • 8 Mar 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Wanli Xue

In multiagent environments, the capability of learning is important for an agent to behave appropriately in face of unknown opponents and dynamic environment.

Q-Learning reinforcement-learning +1

Paper
Add Code

The Mechanics of n-Player Differentiable Games

1 code implementation • ICML 2018 • David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.

149

Paper
Code

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

1 code implementation • NeurIPS 2017 • Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL).

reinforcement-learning Reinforcement Learning (RL)

3,998

Paper
Code

A multi-agent reinforcement learning model of common-pool resource appropriation

4 code implementations • NeurIPS 2017 • Julien Perolat, Joel Z. Leibo, Vinicius Zambaldi, Charles Beattie, Karl Tuyls, Thore Graepel

Here we show that deep reinforcement learning can be used instead.

Multi-agent Reinforcement Learning reinforcement-learning +1

Paper
Code

Lenient Multi-Agent Deep Reinforcement Learning

1 code implementation • 14 Jul 2017 • Gregory Palmer, Karl Tuyls, Daan Bloembergen, Rahul Savani

We find that LDQN agents are more likely to converge to the optimal policy in a stochastic reward CMOTP compared to standard and scheduled-HDQN agents.

Multi-agent Reinforcement Learning reinforcement-learning +1

Paper
Code

Value-Decomposition Networks For Cooperative Multi-Agent Learning

8 code implementations • 16 Jun 2017 • Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal.

Ranked #1 on SMAC+ on Off_Superhard_parallel

Multi-agent Reinforcement Learning reinforcement-learning +2

152

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.