Search Results for author: Tuomas Sandholm

Found 56 papers, 3 papers with code

Near-Optimal No-Regret Learning for General Convex Games

no code implementations 17 Jun 2022 Gabriele Farina, Ioannis Anagnostides, Haipeng Luo, Chung-Wei Lee, Christian Kroer, Tuomas Sandholm

In this paper, we answer this in the positive by establishing the first uncoupled learning algorithm with $O(\log T)$ per-player regret in general \emph{convex games}, that is, games with concave utility functions supported on arbitrary convex and compact strategy sets.

ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret

no code implementations 8 Jun 2022 Stephen McAleer, Gabriele Farina, Marc Lanctot, Tuomas Sandholm

We show that the variance of the estimated regret of a tabular version of ESCHER with an oracle value function is significantly lower than that of outcome sampling MCCFR and tabular DREAM with an oracle value function.

Uncoupled Learning Dynamics with $O(\log T)$ Swap Regret in Multiplayer Games

no code implementations 25 Apr 2022 Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Tuomas Sandholm

In this paper we establish efficient and \emph{uncoupled} learning dynamics so that, when employed by all players in a general-sum multiplayer game, the \emph{swap regret} of each player after $T$ repetitions of the game is bounded by $O(\log T)$, improving over the prior best bounds of $O(\log^4 (T))$.

Structural Analysis of Branch-and-Cut and the Learnability of Gomory Mixed Integer Cuts

no code implementations 15 Apr 2022 Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik

These guarantees apply to infinite families of cutting planes, such as the family of Gomory mixed integer cuts, which are responsible for the main breakthrough speedups of integer programming solvers.

Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation

no code implementations 14 Mar 2022 Brian Zhang, Gabriele Farina, Andrea Celli, Tuomas Sandholm

For team games, the two-sided column generation approach vastly outperforms standard column generation approaches, making it the state of the art algorithm when the parameter is large.

Differentiable Economics for Randomized Affine Maximizer Auctions

no code implementations 6 Feb 2022 Michael Curry, Tuomas Sandholm, John Dickerson

We present an architecture that supports multiple bidders and is perfectly strategyproof, but cannot necessarily represent the optimal mechanism.

Anytime PSRO for Two-Player Zero-Sum Games

no code implementations 19 Jan 2022 Stephen McAleer, Kevin Wang, John Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, Roy Fox

PSRO is based on the tabular double oracle (DO) method, an algorithm that is guaranteed to converge to a Nash equilibrium, but may increase exploitability from one iteration to the next.

Multi-agent Reinforcement Learning reinforcement-learning

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

no code implementations NeurIPS 2021 Gabriele Farina, Tuomas Sandholm

In this paper, we initiate the study of equilibrium refinements for settings where one of the players is perfectly rational (the ``machine'') and the other may make mistakes.

Improved Sample Complexity Bounds for Branch-and-Cut

no code implementations 18 Nov 2021 Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, Ellen Vitercik

If the training set is too small, a configuration may have good performance over the training set but poor performance on future integer programs.

Near-Optimal No-Regret Learning for Correlated Equilibria in Multi-Player General-Sum Games

no code implementations 11 Nov 2021 Ioannis Anagnostides, Constantinos Daskalakis, Gabriele Farina, Maxwell Fishelson, Noah Golowich, Tuomas Sandholm

Recently, Daskalakis, Fishelson, and Golowich (DFG) (NeurIPS'21) showed that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(\textrm{polylog}(T))$ after $T$ repetitions of the game.

Faster No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

no code implementations 29 Sep 2021 Ioannis Anagnostides, Gabriele Farina, Christian Kroer, Tuomas Sandholm

A recent emerging trend in the literature on learning in games has been concerned with providing accelerated learning dynamics for correlated and coarse correlated equilibria in normal-form games.

Better Regularization for Sequential Decision Spaces: Fast Convergence Rates for Nash, Correlated, and Team Equilibria

no code implementations 27 May 2021 Gabriele Farina, Christian Kroer, Tuomas Sandholm

The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria.

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

no code implementations 8 Mar 2021 Gabriele Farina, Tuomas Sandholm

We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment.

Decision Making online learning

Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games

no code implementations 8 Mar 2021 Gabriele Farina, Robin Schmucker, Tuomas Sandholm

Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment.

Decision Making

Generalization in portfolio-based algorithm selection

no code implementations 24 Dec 2020 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

This algorithm configuration procedure works by first selecting a portfolio of diverse algorithm parameter settings, and then, on a given problem instance, using an algorithm selector to choose a parameter setting from the portfolio with strong predicted performance.

Improving Policy-Constrained Kidney Exchange via Pre-Screening

1 code implementation NeurIPS 2020 Duncan C McElfresh, Michael Curry, Tuomas Sandholm, John P Dickerson

In barter exchanges, participants swap goods with one another without exchanging money; exchanges are often facilitated by a central clearinghouse, with the goal of maximizing the aggregate quality (or number) of swaps.

Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games

no code implementations 21 Sep 2020 Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm

Second, we provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.

Polynomial-Time Computation of Optimal Correlated Equilibria in Two-Player Extensive-Form Games with Public Chance Moves and Beyond

no code implementations NeurIPS 2020 Gabriele Farina, Tuomas Sandholm

As of today, it is known that finding an optimal extensive-form correlated equilibrium (EFCE), extensive-form coarse correlated equilibrium (EFCCE), or normal-form coarse correlated equilibrium (NFCCE) in a two-player extensive-form game is computationally tractable when the game does not include chance moves, and intractable when the game involves chance moves.

Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent

no code implementations 28 Jul 2020 Gabriele Farina, Christian Kroer, Tuomas Sandholm

In spite of this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms have been preferred in the practice of solving large-scale games (as the local regret minimizers within the counterfactual regret minimization framework).

Refined bounds for algorithm configuration: The knife-edge of dual class approximability

no code implementations ICML 2020 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

We answer this question for algorithm configuration problems that exhibit a widely-applicable structure: the algorithm's performance as a function of its parameters can be approximated by a "simple" function.

Sparsified Linear Programming for Zero-Sum Equilibrium Finding

no code implementations ICML 2020 Brian Hu Zhang, Tuomas Sandholm

Computational equilibrium finding in large zero-sum extensive-form imperfect-information games has led to significant recent AI breakthroughs.

Efficient exploration of zero-sum stochastic games

no code implementations 24 Feb 2020 Carlos Martin, Tuomas Sandholm

We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay, such as in financial or military simulations and computer games.

Efficient Exploration

Stochastic Regret Minimization in Extensive-Form Games

no code implementations ICML 2020 Gabriele Farina, Christian Kroer, Tuomas Sandholm

Our framework allows us to instantiate several new stochastic methods for solving sequential games.

Efficient Regret Minimization Algorithm for Extensive-Form Correlated Equilibrium

no code implementations NeurIPS 2019 Gabriele Farina, Chun Kai Ling, Fei Fang, Tuomas Sandholm

We show that a regret minimizer can be designed for a scaled extension of any two convex sets, and that from the decomposition we then obtain a global regret minimizer.

Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

no code implementations NeurIPS 2019 Gabriele Farina, Christian Kroer, Tuomas Sandholm

Our algorithms provably converge at a rate of $T^{-1}$, which is superior to prior counterfactual regret minimization algorithms.

Coarse Correlation in Extensive-Form Games

no code implementations 26 Aug 2019 Gabriele Farina, Tommaso Bianchi, Tuomas Sandholm

Coarse correlation models strategic interactions of rational agents complemented by a correlation device, that is a mediator that can recommend behavior but not enforce it.

How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design

no code implementations 8 Aug 2019 Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik

We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm's average performance over the training set and its expected performance.

Generalization Bounds

Learning to Optimize Computational Resources: Frugal Training with Generalization Guarantees

no code implementations 26 May 2019 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

Our algorithm can help compile a configuration portfolio, or it can be used to select the input to a configuration algorithm for finite parameter spaces.

Limited Lookahead in Imperfect-Information Games

no code implementations 17 Feb 2019 Christian Kroer, Tuomas Sandholm

We characterize the hardness of finding a Nash equilibrium or an optimal commitment strategy for either player, showing that in some of these variations the problem can be solved in polynomial time while in others it is PPAD-hard, NP-hard, or inapproximable.

Stable-Predictive Optimistic Counterfactual Regret Minimization

no code implementations 13 Feb 2019 Gabriele Farina, Christian Kroer, Noam Brown, Tuomas Sandholm

The CFR framework has been a powerful tool for solving large-scale extensive-form games in practice.

Practical exact algorithm for trembling-hand equilibrium refinements in games

no code implementations NeurIPS 2018 Gabriele Farina, Nicola Gatti, Tuomas Sandholm

Nash equilibrium strategies have the known weakness that they do not prescribe rational play in situations that are reached with zero probability according to the strategies themselves, for example, if players have made mistakes.

Ex ante coordination and collusion in zero-sum multi-player extensive-form games

no code implementations NeurIPS 2018 Gabriele Farina, Andrea Celli, Nicola Gatti, Tuomas Sandholm

This paper focuses on zero-sum games where a team of players faces an opponent, as is the case, for example, in Bridge, collusion in poker, and many non-recreational applications such as war, where the colluders do not have time or means of communicating during battle, collusion in bidding, where communication during the auction is illegal, and coordinated swindling in public.

A Unified Framework for Extensive-Form Game Abstraction with Bounds

no code implementations NeurIPS 2018 Christian Kroer, Tuomas Sandholm

In this paper we present a unified framework for analyzing abstractions that can express all types of abstractions and solution concepts used in prior papers with performance guarantees---while maintaining comparable bounds on abstraction quality.

Regret Circuits: Composability of Regret Minimizers

no code implementations 6 Nov 2018 Gabriele Farina, Christian Kroer, Tuomas Sandholm

We show that local regret minimizers for the simpler sets can be combined with additional regret minimizers into an aggregate regret minimizer for the composite set.

Deep Counterfactual Regret Minimization

5 code implementations 1 Nov 2018 Noam Brown, Adam Lerer, Sam Gross, Tuomas Sandholm

This paper introduces Deep Counterfactual Regret Minimization, a form of CFR that obviates the need for abstraction by instead using deep neural networks to approximate the behavior of CFR in the full game.

Solving Large Sequential Games with the Excessive Gap Technique

no code implementations NeurIPS 2018 Christian Kroer, Gabriele Farina, Tuomas Sandholm

We present, to our knowledge, the first GPU implementation of a first-order method for extensive-form games.

Solving Imperfect-Information Games via Discounted Regret Minimization

3 code implementations 11 Sep 2018 Noam Brown, Tuomas Sandholm

Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most popular and, in practice, fastest approach to approximately solving large imperfect-information games.

Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games

no code implementations 10 Sep 2018 Gabriele Farina, Christian Kroer, Tuomas Sandholm

Experiments show that our framework leads to algorithms that scale at a rate comparable to the fastest variants of counterfactual regret minimization for computing Nash equilibrium, and therefore our approach leads to the first algorithm for computing quantal response equilibria in extremely large games.

Decision Making

Depth-Limited Solving for Imperfect-Information Games

no code implementations NeurIPS 2018 Noam Brown, Tuomas Sandholm, Brandon Amos

This paper introduces a principled way to conduct depth-limited solving in imperfect-information games by allowing the opponent to choose among a number of strategies for the remainder of the game at the depth limit.

Learning to Branch

no code implementations ICML 2018 Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, Ellen Vitercik

Tree search algorithms recursively partition the search space to find an optimal solution.

Variable Selection

Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead

no code implementations 21 Nov 2017 Christian Kroer, Gabriele Farina, Tuomas Sandholm

We then extend the program to the robust setting for Stackelberg equilibrium under unlimited and under limited lookahead by the opponent.

Regret Minimization in Behaviorally-Constrained Zero-Sum Games

no code implementations ICML 2017 Gabriele Farina, Christian Kroer, Tuomas Sandholm

We use an instantiation of the CFR framework to develop algorithms for solving behaviorally-constrained (and, as a special case, perturbed in the Selten sense) extensive-form games, which allows us to compute approximate Nash equilibrium refinements.

Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning

no code implementations ICML 2017 Noam Brown, Tuomas Sandholm

Iterative algorithms such as Counterfactual Regret Minimization (CFR) are the most popular way to solve large zero-sum imperfect-information games.

Operation Frames and Clubs in Kidney Exchange

no code implementations 25 May 2017 Gabriele Farina, John P. Dickerson, Tuomas Sandholm

A kidney exchange is a centrally-administered barter market where patients swap their willing yet incompatible donors.

Safe and Nested Subgame Solving for Imperfect-Information Games

no code implementations NeurIPS 2017 Noam Brown, Tuomas Sandholm

Thus a subgame cannot be solved in isolation and must instead consider the strategy for the entire game as a whole, unlike perfect-information games.

Translation

Generalization Guarantees for Multi-item Profit Maximization: Pricing, Auctions, and Randomized Mechanisms

no code implementations 29 Apr 2017 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

To answer this question, we uncover structure shared by many pricing, auction, and lottery mechanisms: for any set of buyers' values, profit is piecewise linear in the mechanism's parameters.

Theoretical and Practical Advances on Smoothing for Extensive-Form Games

no code implementations 16 Feb 2017 Christian Kroer, Kevin Waugh, Fatma Kilinc-Karzan, Tuomas Sandholm

By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has no dependence on the branching factor of the player.

Reduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning

no code implementations ICML 2017 Noam Brown, Tuomas Sandholm

Counterfactual Regret Minimization (CFR) is the most popular iterative algorithm for solving zero-sum imperfect-information games.

Sample Complexity of Automated Mechanism Design

no code implementations NeurIPS 2016 Maria-Florina Balcan, Tuomas Sandholm, Ellen Vitercik

In the traditional economic models, it is assumed that the bidders' valuations are drawn from an underlying distribution and that the auction designer has perfect knowledge of this distribution.

Combinatorial Optimization Learning Theory

Position-Indexed Formulations for Kidney Exchange

no code implementations 6 Jun 2016 John P. Dickerson, David F. Manlove, Benjamin Plaut, Tuomas Sandholm, James Trimble

The recent introduction of chains, where a donor without a paired patient triggers a sequence of donations without requiring a kidney in return, increased the efficacy of fielded kidney exchanges---while also dramatically raising the empirical computational hardness of clearing the market in practice.

Hardness of the Pricing Problem for Chains in Barter Exchanges

no code implementations 1 Jun 2016 Benjamin Plaut, John P. Dickerson, Tuomas Sandholm

One of the leading techniques has been branch and price, where column generation is used to incrementally bring cycles and chains into the optimization model on an as-needed basis.

Small Representations of Big Kidney Exchange Graphs

no code implementations 25 May 2016 John P. Dickerson, Aleksandr M. Kazachkov, Ariel D. Procaccia, Tuomas Sandholm

This growth results in more lives saved, but exacerbates the empirical hardness of the $\mathcal{NP}$-complete problem of optimally matching patients to donors.

Regret-Based Pruning in Extensive-Form Games

no code implementations NeurIPS 2015 Noam Brown, Tuomas Sandholm

CFR is an iterative algorithm that repeatedly traverses the game tree, updating regrets at each information set. We introduce an improvement to CFR that prunes any path of play in the tree, and its descendants, that has negative regret.

Algorithms for Closed Under Rational Behavior (CURB) Sets

no code implementations 16 Jan 2014 Michael Benisch, George B. Davis, Tuomas Sandholm

This algorithm serves as a subroutine in a series of polynomial-time algorithms for finding all minimal CURB sets, one minimal CURB set, and the smallest minimal CURB set in a game.
