Search Results for author: Ian Gemp

Found 27 papers, 7 papers with code

Nash Equilibria via Stochastic Eigendecomposition

no code implementations • 4 Nov 2024 • Ian Gemp

This work proposes a novel set of techniques for approximating a Nash equilibrium in a finite, normal-form game.

Soft Condorcet Optimization for Ranking of General Agents

no code implementations • 31 Oct 2024 • Marc Lanctot, Kate Larson, Michael Kaisers, Quentin Berthet, Ian Gemp, Manfred Diaz, Roberto-Rafael Maura-Rivero, Yoram Bachrach, Anna Koop, Doina Precup

This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria.

Convex Markov Games: A Framework for Creativity, Imitation, Fairness, and Safety in Multiagent Learning

no code implementations • 22 Oct 2024 • Ian Gemp, Andreas Haupt, Luke Marris, SiQi Liu, Georgios Piliouras

Behavioral diversity, expert imitation, fairness, safety goals and others give rise to preferences in sequential decision making domains that do not decompose additively across time.

Decision Making Diversity +2

Steering Language Models with Game-Theoretic Solvers

1 code implementation • 24 Jan 2024 • Ian Gemp, Roma Patel, Yoram Bachrach, Marc Lanctot, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, SiQi Liu, Karl Tuyls

Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory.

Imitation Learning Scheduling

Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples

1 code implementation • NeurIPS 2023 • Marco Jiralerspong, Avishek Joey Bose, Ian Gemp, Chongli Qin, Yoram Bachrach, Gauthier Gidel

The past few years have seen impressive progress in the development of deep generative models capable of producing high-dimensional, complex, and photo-realistic data.

Density Estimation Diversity

AlphaSnake: Policy Iteration on a Nondeterministic NP-hard Markov Decision Process

no code implementations • 17 Nov 2022 • Kevin Du, Ian Gemp, Yi Wu, Yingying Wu

Reinforcement learning has recently been used to approach well-known NP-hard combinatorial problems in graph theory.

reinforcement-learning Reinforcement Learning +1

Game Theoretic Rating in N-player general-sum games with Equilibria

no code implementations • 5 Oct 2022 • Luke Marris, Marc Lanctot, Ian Gemp, Shayegan Omidshafiei, Stephen McAleer, Jerome Connor, Karl Tuyls, Thore Graepel

Rating strategies in a game is an important area of research in game theory and artificial intelligence, and can be applied to any real-world competitive or cooperative setting.

Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

no code implementations • 22 Sep 2022 • Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, SiQi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls

The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d humanoids in difficult team coordination tasks.

Deep Reinforcement Learning reinforcement-learning +1

Stochastic Parallelizable Eigengap Dilation for Large Graph Clustering

no code implementations • 29 Jul 2022 • Elise van der Pol, Ian Gemp, Yoram Bachrach, Richard Everett

A core step of spectral clustering is performing an eigendecomposition of the corresponding graph Laplacian matrix (or equivalently, a singular value decomposition, SVD, of the incidence matrix).

Clustering Decision Making +3

The Symmetric Generalized Eigenvalue Problem as a Nash Equilibrium

no code implementations • 10 Jun 2022 • Ian Gemp, Charlie Chen, Brian McWilliams

In this work, we develop a game-theoretic formulation of the top-$k$ SGEP whose Nash equilibrium is the set of generalized eigenvectors.

EigenGame: PCA as a Nash Equilibrium

2 code implementations • ICLR 2021 • Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel

We present a novel view on principal component analysis (PCA) as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximize their own utility function.

Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

1 code implementation • 6 Jun 2020 • Bo Liu, Ian Gemp, Mohammad Ghavamzadeh, Ji Liu, Sridhar Mahadevan, Marek Petrik

In this paper, we introduce proximal gradient temporal difference learning, which provides a principled way of designing and analyzing true stochastic gradient temporal difference learning algorithms.

reinforcement-learning Reinforcement Learning (RL)

Social diversity and social preferences in mixed-motive reinforcement learning

no code implementations • 6 Feb 2020 • Kevin R. McKee, Ian Gemp, Brian McWilliams, Edgar A. Duéñez-Guzmán, Edward Hughes, Joel Z. Leibo

Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity.

Diversity reinforcement-learning +2

Neural Design of Contests and All-Pay Auctions using Multi-Agent Simulation

no code implementations • 25 Sep 2019 • Thomas Anthony, Ian Gemp, Janos Kramar, Tom Eccles, Andrea Tacchetti, Yoram Bachrach

In contrast to auctions designed manually by economists, our method searches the possible design space using a simulation of the multi-agent learning process, and can thus handle settings where a game-theoretic equilibrium analysis is not tractable.

WEAKLY SEMI-SUPERVISED NEURAL TOPIC MODELS

no code implementations • ICLR Workshop LLD 2019 • Ian Gemp, Ramesh Nallapati, Ran Ding, Feng Nan, Bing Xiang

We extend NTMs to the weakly semi-supervised setting by using informative priors in the training objective.

Topic Models

Global Convergence to the Equilibrium of GANs using Variational Inequalities

no code implementations • 4 Aug 2018 • Ian Gemp, Sridhar Mahadevan

In optimization, the negative gradient of a function denotes the direction of steepest descent.

Generative Adversarial Network

Online Monotone Games

no code implementations • 19 Oct 2017 • Ian Gemp, Sridhar Mahadevan

Algorithmic game theory (AGT) focuses on the design and analysis of algorithms for interacting agents, with interactions rigorously formalized within the framework of games.

Reinforcement Learning Reinforcement Learning (RL)

Generative Multi-Adversarial Networks

1 code implementation • 5 Nov 2016 • Ishan Durugkar, Ian Gemp, Sridhar Mahadevan

Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.

Image Generation

Online Monotone Optimization

no code implementations • 29 Aug 2016 • Ian Gemp, Sridhar Mahadevan

This paper presents a new framework for analyzing and designing no-regret algorithms for dynamic (possibly adversarial) systems.

Inverting Variational Autoencoders for Improved Generative Accuracy

no code implementations • 21 Aug 2016 • Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan

Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).

Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces

no code implementations • 26 May 2014 • Sridhar Mahadevan, Bo Liu, Philip Thomas, Will Dabney, Steve Giguere, Nicholas Jacek, Ian Gemp, Ji Liu

In this paper, we set forth a new vision of reinforcement learning developed by us over the past few years, one that yields mathematically rigorous solutions to longstanding important questions that have remained unresolved: (i) how to design reliable, convergent, and robust reinforcement learning algorithms (ii) how to guarantee that reinforcement learning satisfies pre-specified "safety" guarantees, and remains in a stable region of the parameter space (iii) how to design "off-policy" temporal difference learning algorithms in a reliable and stable manner, and finally (iv) how to integrate the study of reinforcement learning into the rich theory of stochastic optimization.

Decision Making reinforcement-learning +4
