no code implementations • 7 Dec 2024 • Constantinos Daskalakis, Ian Gemp, Yanchen Jiang, Renato Paes Leme, Christos Papadimitriou, Georgios Piliouras
Stories are records of our experiences and their analysis reveals insights into the nature of being human.
no code implementations • 4 Nov 2024 • Ian Gemp
This work proposes a novel set of techniques for approximating a Nash equilibrium in a finite, normal-form game.
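The snippet above does not detail the techniques, but the target object is standard. As a point of reference (our own illustration, not the paper's method), here is a minimal numpy sketch of exploitability, the usual measure of how far a mixed-strategy profile is from a Nash equilibrium in a two-player normal-form game:

```python
import numpy as np

def exploitability(A, B, x, y):
    """NashConv for a two-player normal-form game.

    A, B : payoff matrices for the row and column player.
    x, y : mixed strategies (probability vectors).
    Returns the sum of each player's best-response gain, which is
    zero exactly when (x, y) is a Nash equilibrium.
    """
    row_gain = np.max(A @ y) - x @ A @ y  # row player's gain from deviating
    col_gain = np.max(x @ B) - x @ B @ y  # column player's gain
    return row_gain + col_gain

# Matching pennies: the uniform profile is the unique Nash equilibrium.
A = np.array([[1., -1.], [-1., 1.]])
B = -A
x = y = np.array([0.5, 0.5])
print(exploitability(A, B, x, y))  # -> 0.0
```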
no code implementations • 31 Oct 2024 • Marc Lanctot, Kate Larson, Michael Kaisers, Quentin Berthet, Ian Gemp, Manfred Diaz, Roberto-Rafael Maura-Rivero, Yoram Bachrach, Anna Koop, Doina Precup
This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground-truth ranking, a solution to Condorcet's original criteria for a voting system.
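For context, the classical instance of this maximum-likelihood view is Young's result: under a noise model in which each vote independently flips each pairwise comparison of a ground-truth ranking with probability p < 1/2, the MLE is the Kemeny ranking, i.e. the ordering minimizing total pairwise disagreement with the votes. A brute-force sketch (our own illustration, not the paper's algorithm):

```python
import numpy as np
from itertools import permutations

def kemeny_ranking(votes):
    """Brute-force Kemeny ranking: the ordering minimizing pairwise
    disagreements with the votes; the MLE of the ground-truth ranking
    under the independent-flip noise model described above.

    votes: list of rankings, each a tuple of item ids, best first.
    """
    m = len(votes[0])
    pref = np.zeros((m, m))  # pref[a, b] = #votes ranking a above b
    for v in votes:
        pos = {item: i for i, item in enumerate(v)}
        for a in range(m):
            for b in range(m):
                if pos[a] < pos[b]:
                    pref[a, b] += 1

    def disagreements(order):
        pos = {item: i for i, item in enumerate(order)}
        return sum(pref[a, b] for a in range(m) for b in range(m)
                   if pos[a] > pos[b])

    return min(permutations(range(m)), key=disagreements)

votes = [(0, 1, 2), (0, 1, 2), (1, 0, 2), (0, 2, 1)]
print(kemeny_ranking(votes))  # -> (0, 1, 2)
```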
no code implementations • 22 Oct 2024 • Ian Gemp, Andreas Haupt, Luke Marris, SiQi Liu, Georgios Piliouras
Behavioral diversity, expert imitation, fairness, safety goals and others give rise to preferences in sequential decision making domains that do not decompose additively across time.
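One concrete example of a preference that fails to decompose additively (our illustration, not a result from the paper): a risk-sensitive utility with a variance penalty over the per-step rewards of a trajectory \tau,

```latex
U(\tau) = \sum_{t=1}^{T} r_t \;-\; \lambda \,\mathrm{Var}(r_1, \dots, r_T),
\qquad \lambda > 0.
```

No reshaped per-step reward \tilde{r}_t can express U as \sum_t \tilde{r}_t, because the variance term couples rewards across time steps.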
1 code implementation • 24 Jan 2024 • Ian Gemp, Roma Patel, Yoram Bachrach, Marc Lanctot, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, SiQi Liu, Karl Tuyls
Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory.
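A deliberately hypothetical sketch of what such a binding can look like in code: enumerate a small set of candidate messages per speaker, score each message pair with some task-specific evaluator, and treat the resulting payoff matrices as an ordinary two-player game. The function score_outcome below is an assumed stand-in, not the paper's payoff model.

```python
import numpy as np

def dialogue_game(messages_a, messages_b, score_outcome):
    """Bind candidate messages to a normal-form game: entry (i, j)
    holds each speaker's payoff when A says messages_a[i] and B says
    messages_b[j], as judged by the (hypothetical) evaluator."""
    n, m = len(messages_a), len(messages_b)
    U_a, U_b = np.zeros((n, m)), np.zeros((n, m))
    for i, ma in enumerate(messages_a):
        for j, mb in enumerate(messages_b):
            U_a[i, j], U_b[i, j] = score_outcome(ma, mb)
    return U_a, U_b  # now amenable to standard equilibrium solvers
```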
1 code implementation • NeurIPS 2023 • Marco Jiralerspong, Avishek Joey Bose, Ian Gemp, Chongli Qin, Yoram Bachrach, Gauthier Gidel
The past few years have seen impressive progress in the development of deep generative models capable of producing high-dimensional, complex, and photo-realistic data.
no code implementations • 1 Feb 2023 • Zun Li, Marc Lanctot, Kevin R. McKee, Luke Marris, Ian Gemp, Daniel Hennes, Paul Muller, Kate Larson, Yoram Bachrach, Michael P. Wellman
Multiagent reinforcement learning (MARL) has benefited significantly from population-based and game-theoretic training regimes.
no code implementations • 17 Nov 2022 • Kevin Du, Ian Gemp, Yi Wu, Yingying Wu
Reinforcement learning has recently been used to approach well-known NP-hard combinatorial problems in graph theory.
no code implementations • 17 Oct 2022 • Luke Marris, Ian Gemp, Thomas Anthony, Andrea Tacchetti, SiQi Liu, Karl Tuyls
We argue that such a network is a powerful component for many possible multiagent algorithms.
no code implementations • 5 Oct 2022 • Luke Marris, Marc Lanctot, Ian Gemp, Shayegan Omidshafiei, Stephen McAleer, Jerome Connor, Karl Tuyls, Thore Graepel
Rating strategies in a game is an important area of research in game theory and artificial intelligence, and can be applied to any real-world competitive or cooperative setting.
no code implementations • 22 Sep 2022 • Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, SiQi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls
The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning, ranging from computing approximations to fundamental concepts in game theory, to simulating social dilemmas in rich spatial environments, to training 3D humanoids in difficult team-coordination tasks.
no code implementations • 29 Jul 2022 • Elise van der Pol, Ian Gemp, Yoram Bachrach, Richard Everett
A core step of spectral clustering is performing an eigendecomposition of the corresponding graph Laplacian matrix (or equivalently, a singular value decomposition, SVD, of the incidence matrix).
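For reference, a minimal dense version of that step (textbook spectral clustering, not the paper's scalable method):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian
from sklearn.cluster import KMeans

def spectral_clustering(W, k):
    """Cluster a graph given its symmetric affinity matrix W:
    eigendecompose the normalized Laplacian and run k-means on the
    rows of the bottom-k eigenvector matrix."""
    L = laplacian(W, normed=True)
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    embedding = vecs[:, :k]       # k smallest eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(embedding)
```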
no code implementations • 10 Jun 2022 • Ian Gemp, Charlie Chen, Brian McWilliams
In this work, we develop a game-theoretic formulation of the top-$k$ symmetric generalized eigenvalue problem (SGEP) whose Nash equilibrium is the set of generalized eigenvectors.
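For reference, the top-$k$ SGEP solved directly with a dense solver; the paper's formulation instead recovers the same eigenvectors as the Nash equilibrium of a game played by k parallel, gradient-based players. The toy matrices below are our own:

```python
import numpy as np
from scipy.linalg import eigh

# Symmetric generalized eigenvalue problem: A v = lambda B v,
# with A symmetric and B symmetric positive-definite.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
A = X.T @ X / 100                      # e.g., a covariance matrix
B = np.eye(5) + 0.1 * np.ones((5, 5))  # an SPD "metric" matrix
evals, evecs = eigh(A, B)              # ascending eigenvalue order
top_k = evecs[:, -2:]                  # top-2 generalized eigenvectors
```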
no code implementations • NAACL 2021 • Roma Patel, Marta Garnelo, Ian Gemp, Chris Dyer, Yoram Bachrach
We propose a vocabulary selection method that views words as members of a team trying to maximize the model's performance.
1 code implementation • ICLR 2022 • Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel
We build on the recently proposed EigenGame, which views eigendecomposition as a competitive game.
2 code implementations • ICLR 2021 • Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel
We present a novel view on principal component analysis (PCA) as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximize their own utility function.
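A minimal numpy sketch of the resulting update, following the utility published in the EigenGame paper (step size, initialization, and the sequential player ordering here are our simplifications):

```python
import numpy as np

def eigengame_update(V, M, i, lr=0.1):
    """One ascent step for player i in EigenGame. Player i maximizes
    its Rayleigh quotient under M minus a penalty for aligning with
    parent players j < i; the top-k eigenvectors of M form the unique
    strict Nash equilibrium of the game."""
    v = V[i]
    grad = M @ v
    for j in range(i):
        u = V[j]
        grad -= (v @ M @ u) / (u @ M @ u) * (M @ u)
    grad -= (grad @ v) * v         # project onto sphere's tangent space
    v = v + lr * grad              # Riemannian ascent step
    V[i] = v / np.linalg.norm(v)   # retract back to the unit sphere
    return V
```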
2 code implementations • NeurIPS 2020 • Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, Yoram Bachrach
It also features a large combinatorial action space and simultaneous moves, which are challenging for RL algorithms.
1 code implementation • 6 Jun 2020 • Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik
In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms.
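For concreteness, the GTD2 update that this line of work recasts as a true stochastic gradient method on a primal-dual saddle-point objective (a sketch under linear function approximation; step sizes are placeholders):

```python
import numpy as np

def gtd2_step(theta, w, phi, phi_next, r, gamma, alpha, beta):
    """One GTD2 update. theta: value-function weights; w: auxiliary
    (dual) weights; phi, phi_next: features of s and s'; r: reward."""
    delta = r + gamma * (theta @ phi_next) - theta @ phi   # TD error
    theta = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    w = w + beta * (delta - phi @ w) * phi
    return theta, w
```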
no code implementations • 6 Feb 2020 • Kevin R. McKee, Ian Gemp, Brian McWilliams, Edgar A. Duéñez-Guzmán, Edward Hughes, Joel Z. Leibo
Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity.
no code implementations • 25 Sep 2019 • Thomas Anthony, Ian Gemp, Janos Kramar, Tom Eccles, Andrea Tacchetti, Yoram Bachrach
In contrast to auctions designed manually by economists, our method searches the possible design space using a simulation of the multi-agent learning process, and can thus handle settings where a game-theoretic equilibrium analysis is not tractable.
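A deliberately toy sketch of search-via-simulation (all dynamics below are illustrative stand-ins for the paper's multi-agent learners): score each candidate reserve price of a first-price auction by the revenue reached after simulated bidders adapt their bid shading.

```python
import numpy as np

rng = np.random.default_rng(0)

def learned_revenue(reserve, epochs=200, batch=100, sigma=0.05):
    """Revenue of a first-price auction with the given reserve, after
    two bidders hill-climb their shading factors (bid = shade * value)."""
    shade = np.full(2, 0.8)

    def payoff(s, i):  # bidder i's mean utility if it shades by s
        values = rng.uniform(size=(batch, 2))
        trial = shade.copy()
        trial[i] = s
        bids = trial * values
        win = (bids.argmax(axis=1) == i) & (bids.max(axis=1) >= reserve)
        return np.where(win, values[:, i] - bids[:, i], 0.0).mean()

    for _ in range(epochs):           # noisy best-response hill climbing
        for i in range(2):
            cand = float(np.clip(shade[i] + sigma * rng.standard_normal(),
                                 0.05, 1.0))
            if payoff(cand, i) > payoff(shade[i], i):
                shade[i] = cand

    values = rng.uniform(size=(10_000, 2))
    bids = shade * values
    sold = bids.max(axis=1) >= reserve
    return float((bids.max(axis=1) * sold).mean())

# Outer loop: search the (one-dimensional) design space by simulation.
best_reserve = max(np.linspace(0.0, 0.8, 9), key=learned_revenue)
```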
no code implementations • ICLR Workshop LLD 2019 • Ian Gemp, Ramesh Nallapati, Ran Ding, Feng Nan, Bing Xiang
We extend neural topic models (NTMs) to the weakly semi-supervised setting by using informative priors in the training objective.
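One common way an informative prior enters such an objective (a hedged sketch; the paper's exact prior is not reproduced here) is to replace the uninformative prior in the ELBO's KL term with a prior p_0(z) encoding weak supervision, e.g. seed words per topic:

```latex
\mathcal{L}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  \;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p_0(z)\big)
```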
no code implementations • 4 Aug 2018 • Ian Gemp, Sridhar Mahadevan
In optimization, the negative gradient of a function points in the direction of steepest descent.
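For reference, the standard one-line derivation: the first-order expansion f(x + \epsilon d) \approx f(x) + \epsilon \nabla f(x)^\top d is minimized over unit vectors d at d^\ast = -\nabla f(x)/\lVert\nabla f(x)\rVert, which is why gradient descent steps along the negative gradient:

```latex
x_{t+1} = x_t - \eta \, \nabla f(x_t), \qquad \eta > 0.
```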
no code implementations • 19 Oct 2017 • Ian Gemp, Sridhar Mahadevan
Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games.
1 code implementation • 5 Nov 2016 • Ishan Durugkar, Ian Gemp, Sridhar Mahadevan
Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.
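For reference, the minimax game in question is the standard GAN objective of Goodfellow et al. (2014); this paper (GMAN) extends the discriminator side to multiple adversaries:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```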
no code implementations • 29 Aug 2016 • Ian Gemp, Sridhar Mahadevan
This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems.
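For reference, the static benchmark used by standard (external) regret, where "no-regret" means R_T / T \to 0; the dynamic setting studied here generalizes the fixed comparator x:

```latex
R_T = \sum_{t=1}^{T} l_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} l_t(x)
```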
no code implementations • 21 Aug 2016 • Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan
Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).
no code implementations • 26 May 2014 • Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu
In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms; (ii) how to ensure that reinforcement learning satisfies pre-specified "safety" guarantees and remains in a stable region of the parameter space; (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner; and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.