no code implementations • ICML 2020 • Tanner Fiez, Benjamin Chasnov, Lillian Ratliff
Contemporary work on learning in continuous games has commonly overlooked the hierarchical decision-making structure present in machine learning problems formulated as games, instead treating them as simultaneous-play games and adopting the Nash equilibrium solution concept.
no code implementations • 16 Feb 2024 • Tanner Fiez, Houssam Nassif, Yu-cheng Chen, Sergio Gamez, Lalit Jain
Adaptive experimental design (AED) methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods.
no code implementations • 2 Feb 2023 • Fanjie Kong, Yuan Li, Houssam Nassif, Tanner Fiez, Ricardo Henao, Shreya Chakrabarti
In digital marketing, experimenting with new website content is one of the key levers to improve customer engagement.
no code implementations • 25 Oct 2022 • Tanner Fiez, Sergio Gamez, Arick Chen, Houssam Nassif, Lalit Jain
Adaptive experimental design methods are increasingly being used in industry as a tool to boost testing throughput or reduce experimentation cost relative to traditional A/B/N testing methods.
no code implementations • NeurIPS 2021 • Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang
For the class of nonconvex-PL zero-sum games, we exploit timescale separation to construct a potential function that, when combined with the stability characterization and an asymptotic saddle-avoidance result, gives a global, almost-sure asymptotic convergence guarantee to the set of strict local minmax equilibria.
no code implementations • NeurIPS 2021 • Tanner Fiez, Ryann Sim, Stratis Skoulakis, Georgios Piliouras, Lillian Ratliff
Classical learning results build on von Neumann's minmax theorem to show that online no-regret dynamics converge to an equilibrium in a time-average sense in zero-sum games.
1 code implementation • 25 Sep 2021 • Liyuan Zheng, Tanner Fiez, Zane Alumbaugh, Benjamin Chasnov, Lillian J. Ratliff
The hierarchical interaction between the actor and critic in actor-critic based reinforcement learning algorithms naturally lends itself to a game-theoretic interpretation.
1 code implementation • ICLR 2022 • Tanner Fiez, Chi Jin, Praneeth Netrapalli, Lillian J. Ratliff
This paper considers minimax optimization $\min_x \max_y f(x, y)$ in the challenging setting where $f$ can be both nonconvex in $x$ and nonconcave in $y$.
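The difficulty of this general setting shows up already in the bilinear special case $f(x, y) = xy$: simultaneous gradient descent-ascent spirals away from the unique equilibrium at the origin for any fixed step size. The sketch below is an illustration of that classical failure mode, not the paper's algorithm; the step size and iteration count are arbitrary choices.

```python
import numpy as np

def gda_bilinear(x, y, lr=0.1, steps=500):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.
    The unique equilibrium is (0, 0), but the iterates spiral outward."""
    for _ in range(steps):
        gx, gy = y, x                      # df/dx = y, df/dy = x
        x, y = x - lr * gx, y + lr * gy    # descent on x, ascent on y
    return x, y
```

Each step multiplies the squared distance from the origin by exactly $1 + \text{lr}^2$, so plain gradient descent-ascent diverges here for any fixed step size, which is one reason the nonconvex-nonconcave setting calls for different algorithms and solution concepts.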
1 code implementation • 15 Dec 2020 • Stratis Skoulakis, Tanner Fiez, Ryann Sim, Georgios Piliouras, Lillian Ratliff
The predominant paradigm in evolutionary game theory, and more generally online learning in games, is based on a clear distinction between a population of dynamic agents and the fixed, static game in which they interact.
1 code implementation • ICLR 2021 • Tanner Fiez, Lillian Ratliff
In this work, we bridge the gap left by past work by showing there exists a finite timescale separation parameter $\tau^{\ast}$ such that $x^{\ast}$ is a stable critical point of gradient descent-ascent for all $\tau \in (\tau^{\ast}, \infty)$ if and only if it is a strict local minmax equilibrium.
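The role of the timescale separation parameter can be seen on a simple quadratic zero-sum game. The sketch below is an illustrative example of $\tau$-GDA, not code from the paper; the objective $f(x, y) = -x^2/2 + 2xy - y^2/2$ and all constants are my own choices. Here $(0, 0)$ is a strict local minmax equilibrium, and the dynamics converge to it only when the maximizing player moves on a sufficiently faster timescale (for this game, $\tau^{\ast} = 1$).

```python
import numpy as np

def tau_gda(x, y, tau, lr=0.01, steps=5000):
    """Gradient descent-ascent with timescale separation tau on the
    illustrative quadratic f(x, y) = -x**2/2 + 2*x*y - y**2/2.
    (0, 0) is a strict local minmax point; it is stable only for tau > 1."""
    for _ in range(steps):
        gx = -x + 2 * y                # df/dx
        gy = 2 * x - y                 # df/dy
        x, y = x - lr * gx, y + tau * lr * gy
    return x, y
```

With `tau=4.0` the iterates contract to the origin, while with `tau=0.5` the same dynamics spiral away from it, matching the finite-threshold picture described above.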
1 code implementation • 27 Jun 2020 • Tanner Fiez, Nihar B. Shah, Lillian Ratliff
Theoretically, we show a local optimality guarantee of our algorithm and prove that popular baselines are considerably suboptimal.
1 code implementation • NeurIPS 2019 • Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff
Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost.
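One classical way to handle a limited measurement set $X$ with a separate target set $Z$ is to allocate samples by minimizing $\max_{z \in Z} z^{\top} A(\lambda)^{-1} z$ with $A(\lambda) = \sum_i \lambda_i x_i x_i^{\top}$. The Frank-Wolfe sketch below is a generic illustration of that transductive design objective, not the paper's algorithm; the function name and all constants are my own.

```python
import numpy as np

def transductive_design(X, Z, iters=2000):
    """Frank-Wolfe on the transductive design objective
    min_lambda max_{z in Z} z^T A(lambda)^{-1} z,  A(lambda) = sum_i lam_i x_i x_i^T.
    Illustrative sketch only."""
    n, d = X.shape
    lam = np.ones(n) / n
    for t in range(iters):
        A = (X.T * lam) @ X + 1e-9 * np.eye(d)      # weighted covariance
        Ainv = np.linalg.inv(A)
        vals = np.einsum('ij,jk,ik->i', Z, Ainv, Z)  # z^T A^{-1} z for each target
        z = Z[np.argmax(vals)]                       # worst-case target direction
        scores = (X @ Ainv @ z) ** 2                 # negative gradient wrt lam_i
        i = int(np.argmax(scores))                   # Frank-Wolfe vertex
        step = 2.0 / (t + 3)
        lam = (1 - step) * lam
        lam[i] += step
    A = (X.T * lam) @ X + 1e-9 * np.eye(d)
    value = float(np.max(np.einsum('ij,jk,ik->i', Z, np.linalg.inv(A), Z)))
    return lam, value
```

For example, with measurement vectors $e_1, e_2$ and the single target direction $z = e_1 - e_2$, the objective is $1/\lambda_1 + 1/\lambda_2$, minimized by the uniform allocation with value 4.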
1 code implementation • 4 Jun 2019 • Tanner Fiez, Benjamin Chasnov, Lillian J. Ratliff
Using this insight, we develop a gradient-based update for the leader while the follower employs a best response strategy for which each stable critical point is guaranteed to be a Stackelberg equilibrium in zero-sum games.
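A minimal sketch of this leader-follower structure on a quadratic zero-sum game (the coefficients and learning rate are illustrative choices, not taken from the paper): the follower plays an exact best response, and the leader descends the total derivative of $f(x, y^{\ast}(x))$, which by the envelope theorem reduces to the partial derivative $\partial f / \partial x$ evaluated at the follower's best response.

```python
import numpy as np

# Illustrative zero-sum game f(x, y) = a*x**2/2 + b*x*y - c*y**2/2;
# with a = -1, b = 2, c = 1 the Stackelberg equilibrium is (0, 0), even
# though the game is concave in the leader's variable x.
a, b, c = -1.0, 2.0, 1.0

def best_response(x):
    # follower maximizes f over y: solve df/dy = b*x - c*y = 0
    return b * x / c

def leader_step(x, lr=0.1):
    y = best_response(x)
    # d/dx f(x, y*(x)) = f_x + f_y * dy*/dx; f_y = 0 at a best response,
    # so the update uses f_x evaluated at (x, y*(x))
    grad = a * x + b * y
    return x - lr * grad

def run_leader(x0=1.0, steps=200, lr=0.1):
    x = x0
    for _ in range(steps):
        x = leader_step(x, lr)
    return x
```

Note that on this same game, simultaneous gradient descent-ascent with equal learning rates is unstable at the origin, which is the kind of gap between Nash and Stackelberg dynamics the hierarchical view is meant to capture.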
no code implementations • 6 Jul 2018 • Tanner Fiez, Shreyas Sekar, Liyuan Zheng, Lillian J. Ratliff
The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platforms continue to proliferate.
no code implementations • 11 Mar 2018 • Tanner Fiez, Shreyas Sekar, Lillian J. Ratliff
We analyze these algorithms under two types of smoothed reward feedback at the end of each epoch: a reward that is the discount-average of the discounted rewards within an epoch, and a reward that is the time-average of the rewards within an epoch.
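The two feedback types can be stated concretely: for an epoch of rewards $r_0, \dots, r_{T-1}$, the discount-average is $\sum_t \gamma^t r_t / \sum_t \gamma^t$ and the time-average is $\frac{1}{T}\sum_t r_t$. A small sketch of both (the function names are my own, not from the paper):

```python
import numpy as np

def discount_average(rewards, gamma):
    """Discount-average of an epoch's rewards: sum_t gamma^t r_t / sum_t gamma^t."""
    r = np.asarray(rewards, dtype=float)
    w = gamma ** np.arange(len(r))
    return float(np.sum(w * r) / np.sum(w))

def time_average(rewards):
    """Time-average of an epoch's rewards: mean of r_0, ..., r_{T-1}."""
    return float(np.mean(rewards))
```

The discount-average weights early rewards in the epoch more heavily (for $\gamma < 1$), while the time-average treats every step in the epoch equally.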