no code implementations • 30 Oct 2024 • Zhiyuan Fan, Christian Kroer, Gabriele Farina
Along with a new regret lower bound for online learning in sequence-form strategy spaces, we show that this ratio is nearly optimal.
1 code implementation • 29 Jul 2024 • Mingyang Liu, Gabriele Farina, Asuman Ozdaglar
LiteEFG is an efficient library with easy-to-use Python bindings, which can solve multiplayer extensive-form games (EFGs).
no code implementations • 15 Jun 2024 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix and $\widetilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games.
no code implementations • 19 Dec 2023 • Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm
Policy gradient methods enjoy strong practical performance in numerous tasks in reinforcement learning.
no code implementations • 16 Nov 2023 • Athul Paul Jacob, Gabriele Farina, Jacob Andreas
We present a model of pragmatic language understanding, where utterances are produced and understood by searching for regularized equilibria of signaling games.
no code implementations • 1 Nov 2023 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$), and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice.
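As a rough illustration of the RM$^+$ update mentioned above, here is a minimal self-play sketch on a small two-player zero-sum matrix game. The function name `regret_matching_plus`, the helper `exploitability`, and the example payoff matrix are assumptions chosen for illustration; this is plain RM$^+$ with uniform averaging, not the variants studied in the paper.

```python
import numpy as np

def regret_matching_plus(payoff, iterations=10000):
    """Self-play RM+ on a zero-sum matrix game (illustrative sketch).

    payoff[i, j] is the row player's utility; the column player receives
    its negative. Returns the uniformly averaged strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    q_row = np.zeros(n_rows)  # clipped cumulative regrets, row player
    q_col = np.zeros(n_cols)  # clipped cumulative regrets, column player
    avg_row = np.zeros(n_rows)
    avg_col = np.zeros(n_cols)

    def strategy(q):
        # Play proportionally to positive regrets; uniform if all are zero.
        total = q.sum()
        return q / total if total > 0 else np.full(len(q), 1.0 / len(q))

    for _ in range(iterations):
        x = strategy(q_row)
        y = strategy(q_col)
        u_row = payoff @ y        # utility of each row action against y
        u_col = -(payoff.T @ x)   # utility of each column action against x
        # RM+ step: add instantaneous regrets, then clip at zero.
        q_row = np.maximum(q_row + u_row - x @ u_row, 0.0)
        q_col = np.maximum(q_col + u_col - y @ u_col, 0.0)
        avg_row += x
        avg_col += y
    return avg_row / iterations, avg_col / iterations

def exploitability(payoff, x, y):
    # Sum of both players' best-response gains (the saddle-point gap).
    return float((payoff @ y).max() - (x @ payoff).min())

# Hypothetical 2x2 game whose unique equilibrium mixes (0.4, 0.6) for both players.
game = np.array([[2.0, -1.0], [-1.0, 1.0]])
x_avg, y_avg = regret_matching_plus(game)
gap = exploitability(game, x_avg, y_avg)
```

The only difference from plain regret matching is the clipping step: cumulative regrets are thresholded at zero after every update rather than only when forming the strategy.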
no code implementations • 13 Oct 2023 • Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas
When applied to question answering and other text generation tasks, language models (LMs) may be queried generatively (by sampling answers from their output distribution) or discriminatively (by using them to score or rank a set of candidate outputs).
no code implementations • 25 Apr 2023 • Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown
Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
1 code implementation • Science 2022 • Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, Markus Zijlstra
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge.
1 code implementation • 11 Oct 2022 • Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown
We then show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL that provides a model of human play while simultaneously training an agent that responds well to this human model.
no code implementations • 20 Aug 2022 • Ioannis Anagnostides, Gabriele Farina, Tuomas Sandholm
In this paper, we establish efficient and uncoupled learning dynamics so that, when employed by all players in multiplayer perfect-recall imperfect-information extensive-form games, the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play.
no code implementations • 17 Jun 2022 • Gabriele Farina, Ioannis Anagnostides, Haipeng Luo, Chung-Wei Lee, Christian Kroer, Tuomas Sandholm
In this paper, we answer this in the positive by establishing the first uncoupled learning algorithm with $O(\log T)$ per-player regret in general \emph{convex games}, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets.
1 code implementation • 8 Jun 2022 • Stephen McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm
DREAM, the only current CFR-based neural method that is model free and therefore scalable to very large games, trains a neural network on an estimated regret target that can have extremely high variance due to an importance sampling term inherited from Monte Carlo CFR (MCCFR).
no code implementations • 25 Apr 2022 • Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Tuomas Sandholm
In this paper we establish efficient and \emph{uncoupled} learning dynamics so that, when employed by all players in a general-sum multiplayer game, the \emph{swap regret} of each player after $T$ repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^4 (T))$.
no code implementations • 14 Mar 2022 • Brian Zhang, Gabriele Farina, Andrea Celli, Tuomas Sandholm
We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games: normal-form coarse correlated equilibrium (NFCCE), extensive-form coarse correlated equilibrium (EFCCE), and extensive-form correlated equilibrium (EFCE).
no code implementations • 1 Feb 2022 • Gabriele Farina, Chung-Wei Lee, Haipeng Luo, Christian Kroer
In this paper we show that the Optimistic Multiplicative Weights Update (OMWU) algorithm -- the premier learning algorithm for NFGs -- can be simulated on the normal-form equivalent of an EFG in linear time per iteration in the game tree size using a kernel trick.
no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown
We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.
no code implementations • NeurIPS 2021 • Gabriele Farina, Tuomas Sandholm
In this paper, we initiate the study of equilibrium refinements for settings where one of the players is perfectly rational (the ``machine'') and the other may make mistakes.
no code implementations • 11 Nov 2021 • Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm
Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game.
no code implementations • 29 Sep 2021 • Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Tuomas Sandholm
An emerging trend in the literature on learning in games has been concerned with providing accelerated learning dynamics for correlated and coarse correlated equilibria in normal-form games.
no code implementations • 27 May 2021 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria.
no code implementations • 4 Apr 2021 • Gabriele Farina, Andrea Celli, Alberto Marchesi, Nicola Gatti
The existence of simple uncoupled no-regret learning dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems.
no code implementations • 8 Mar 2021 • Gabriele Farina, Robin Schmucker, Tuomas Sandholm
Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment.
no code implementations • 8 Mar 2021 • Gabriele Farina, Tuomas Sandholm
We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment.
no code implementations • 21 Sep 2020 • Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm
Second, we provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
no code implementations • NeurIPS 2020 • Gabriele Farina, Tuomas Sandholm
As of today, it is known that finding an optimal extensive-form correlated equilibrium (EFCE), extensive-form coarse correlated equilibrium (EFCCE), or normal-form coarse correlated equilibrium (NFCCE) in a two-player extensive-form game is computationally tractable when the game does not include chance moves, and intractable when the game involves chance moves.
no code implementations • 28 Jul 2020 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
In spite of this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms have been preferred in the practice of solving large-scale games (as the local regret minimizers within the counterfactual regret minimization framework).
no code implementations • NeurIPS 2020 • Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti
When each player has low trigger regret, the empirical frequency of play is close to an EFCE.
no code implementations • ICML 2020 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
Our framework allows us to instantiate several new stochastic methods for solving sequential games.
no code implementations • NeurIPS 2019 • Gabriele Farina, Chun Kai Ling, Fei Fang, Tuomas Sandholm
We show that a regret minimizer can be designed for a scaled extension of any two convex sets, and that from the decomposition we then obtain a global regret minimizer.
no code implementations • NeurIPS 2019 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
Our algorithms provably converge at a rate of $T^{-1}$, which is superior to prior counterfactual regret minimization algorithms.
no code implementations • 26 Aug 2019 • Gabriele Farina, Tommaso Bianchi, Tuomas Sandholm
Coarse correlation models strategic interactions of rational agents complemented by a correlation device, that is, a mediator that can recommend behavior but not enforce it.
no code implementations • 13 Feb 2019 • Gabriele Farina, Christian Kroer, Noam Brown, Tuomas Sandholm
The CFR framework has been a powerful tool for solving large-scale extensive-form games in practice.
no code implementations • NeurIPS 2018 • Gabriele Farina, Nicola Gatti, Tuomas Sandholm
Nash equilibrium strategies have the known weakness that they do not prescribe rational play in situations that are reached with zero probability according to the strategies themselves, for example, if players have made mistakes.
no code implementations • NeurIPS 2018 • Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm
This paper focuses on zero-sum games where a team of players faces an opponent, as is the case, for example, in Bridge, collusion in poker, and many non-recreational applications such as war, where the colluders do not have time or means of communicating during battle; collusion in bidding, where communication during the auction is illegal; and coordinated swindling in public.
no code implementations • 6 Nov 2018 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
We show that local regret minimizers for the simpler sets can be combined with additional regret minimizers into an aggregate regret minimizer for the composite set.
no code implementations • NeurIPS 2018 • Christian Kroer, Gabriele Farina, Tuomas Sandholm
We present, to our knowledge, the first GPU implementation of a first-order method for extensive-form games.
no code implementations • 10 Sep 2018 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games.
no code implementations • 21 Nov 2017 • Christian Kroer, Gabriele Farina, Tuomas Sandholm
We then extend the program to the robust setting for Stackelberg equilibrium under unlimited and under limited lookahead by the opponent.
no code implementations • ICML 2017 • Gabriele Farina, Christian Kroer, Tuomas Sandholm
We use an instantiation of the CFR framework to develop algorithms for solving behaviorally-constrained (and, as a special case, perturbed in the Selten sense) extensive-form games, which allows us to compute approximate Nash equilibrium refinements.
no code implementations • 25 May 2017 • Gabriele Farina, John P. Dickerson, Tuomas Sandholm
A kidney exchange is a centrally-administered barter market where patients swap their willing yet incompatible donors.
no code implementations • 30 Nov 2015 • Massimo Cairo, Gabriele Farina, Romeo Rizzi
In this paper, we present a novel algorithm for the maximum a posteriori decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the worst-case running time of the classical Viterbi algorithm by a logarithmic factor.
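For context, the classical Viterbi baseline that this abstract refers to can be sketched as follows. This is the textbook $O(TK^2)$ dynamic program for $K$ states and $T$ observations, not the faster algorithm from the paper; the function name `viterbi` and the argument layout are assumptions made for illustration.

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit, observations):
    """Classical Viterbi decoding in log space (illustrative sketch).

    log_init[i]     : log-probability of starting in state i
    log_trans[i, j] : log-probability of moving from state i to state j
    log_emit[i, o]  : log-probability of emitting symbol o in state i
    Returns the most likely state path and its log-probability.
    """
    n_states = log_init.shape[0]
    T = len(observations)
    score = log_init + log_emit[:, observations[0]]
    back = np.zeros((T, n_states), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score of a path that is in state i at t-1 and j at t.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[:, observations[t]]
    # Follow back-pointers from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], float(score.max())

# Hypothetical two-state, three-symbol HMM used only as a usage example.
init = np.log(np.array([0.6, 0.4]))
trans = np.log(np.array([[0.7, 0.3], [0.4, 0.6]]))
emit = np.log(np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]))
best_path, best_logp = viterbi(init, trans, emit, [0, 1, 2])
```

The inner step compares all $K^2$ state pairs per time step, which is the worst-case cost the paper improves on by a logarithmic factor.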