Search Results for author: Yang Cai

Found 20 papers, 2 papers with code

On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games

no code implementations • 4 Mar 2025 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

Non-ergodic convergence of learning dynamics in games has been widely studied recently because of its importance in both theory and practice.

On the Convergence of Min-Max Langevin Dynamics and Algorithm

no code implementations • 29 Dec 2024 • Yang Cai, Siddharth Mitra, Xiuyuan Wang, Andre Wibisono

We prove an exponential convergence guarantee for the mean-field min-max Langevin dynamics to compute the equilibrium distribution of the zero-sum game.

COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences

1 code implementation • 30 Oct 2024 • Yixin Liu, Argyris Oikonomou, Weiqiang Zheng, Yang Cai, Arman Cohan

To achieve robust alignment with general preferences, we model the alignment problem as a two-player zero-sum game, where the Nash equilibrium policy guarantees a 50% win rate against any competing policy.

Language Modelling

Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms

no code implementations • 15 Jun 2024 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

While both algorithms enjoy $O(1/T)$ ergodic convergence to Nash equilibrium in two-player zero-sum games, OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix and $\widetilde{O}(1/T)$ convergence to coarse correlated equilibria even in general-sum games.

On Tractable $\Phi$-Equilibria in Non-Concave Games

no code implementations • 13 Mar 2024 • Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng

While Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to a coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when utilities are non-concave -- a common scenario in machine learning applications involving strategies parameterized by deep neural networks, or when agents' utilities are computed by neural networks, or both.

Multi-Scale Semantic Segmentation with Modified MBConv Blocks

no code implementations • 7 Feb 2024 • Xi Chen, Yang Cai, Yuan Wu, Bo Xiong, Taesung Park

Recently, MBConv blocks, initially designed for efficiency in resource-limited settings and later adapted for cutting-edge image classification performance, have demonstrated significant potential in image classification tasks.

Classification, Image Classification

Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games

no code implementations • 26 Jan 2024 • Yang Cai, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng

In this paper, we improve both results significantly by providing an uncoupled policy optimization algorithm that attains a near-optimal $\tilde{O}(T^{-1})$ convergence rate for computing a correlated equilibrium.

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games

no code implementations • 1 Nov 2023 • Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng

Despite their widespread use for solving real games, virtually nothing is known about their last-iterate convergence.

Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds

no code implementations • 29 Jun 2023 • Yang Cai, Michael I. Jordan, Tianyi Lin, Argyris Oikonomou, Emmanouil-Vasileios Vlatakis-Gkaragkounis

Numerous applications in machine learning and data analytics can be formulated as equilibrium computation over Riemannian manifolds.

User Response in Ad Auctions: An MDP Formulation of Long-Term Revenue Optimization

no code implementations • 16 Feb 2023 • Yang Cai, Zhe Feng, Christopher Liaw, Aranyak Mehta, Grigoris Velegkas

We characterize the optimal mechanism for this MDP as a Myerson's auction with a notion of modified virtual value, which relies on the value distribution of the advertiser, the current user state, and the future impact of showing the ad to the user.

Doubly Optimal No-Regret Learning in Monotone Games

1 code implementation • 30 Jan 2023 • Yang Cai, Weiqiang Zheng

We propose the accelerated optimistic gradient (AOG) algorithm, the first doubly optimal no-regret learning algorithm for smooth monotone games.

Accelerated Single-Call Methods for Constrained Min-Max Optimization

no code implementations • 6 Oct 2022 • Yang Cai, Weiqiang Zheng

Finally, we show that the Reflected Gradient (RG) method, another single-call single-projection algorithm, has an $O(\frac{1}{\sqrt{T}})$ last-iterate convergence rate for constrained convex-concave min-max optimization, answering an open problem of [Hsieh et al., 2019].

Accelerated Algorithms for Constrained Nonconvex-Nonconcave Min-Max Optimization and Comonotone Inclusion

no code implementations • 10 Jun 2022 • Yang Cai, Argyris Oikonomou, Weiqiang Zheng

In our first contribution, we extend the Extra Anchored Gradient (EAG) algorithm, originally proposed by Yoon and Ryu (2021) for unconstrained min-max optimization, to constrained comonotone min-max optimization and comonotone inclusion, achieving an optimal convergence rate of $O\left(\frac{1}{T}\right)$ among all first-order methods.

Tight Last-Iterate Convergence of the Extragradient and the Optimistic Gradient Descent-Ascent Algorithm for Constrained Monotone Variational Inequalities

no code implementations • 20 Apr 2022 • Yang Cai, Argyris Oikonomou, Weiqiang Zheng

We use the tangent residual (or a slight variation of the tangent residual) as the potential function in our analysis of the extragradient algorithm (or the optimistic gradient descent-ascent algorithm) and prove that it is non-increasing between two consecutive iterates.

Recommender Systems meet Mechanism Design

no code implementations • 25 Oct 2021 • Yang Cai, Constantinos Daskalakis

We propose a mechanism design framework for this setting, building on a recent robustification framework by Brustle et al., which disentangles the statistical challenge of estimating a multi-dimensional prior from the task of designing a good mechanism for it, and robustifies the performance of the latter against the estimation error of the former.

Recommendation Systems, Topic Models

Multi-Item Mechanisms without Item-Independence: Learnability via Robustness

no code implementations • 6 Nov 2019 • Johannes Brustle, Yang Cai, Constantinos Daskalakis

When item values are sampled from more general graphical models, we combine our robustness theorem with novel sample complexity results for learning Markov Random Fields or Bayesian Networks in Prokhorov distance, which may be of independent interest.

Learning Safe Policies with Expert Guidance

no code implementations • NeurIPS 2018 • Jessie Huang, Fa Wu, Doina Precup, Yang Cai

We propose a framework for ensuring safe behavior of a reinforcement learning agent when the reward function may be difficult to specify.

Reinforcement Learning

Learning Multi-item Auctions with (or without) Samples

no code implementations • 1 Sep 2017 • Yang Cai, Constantinos Daskalakis

The second is a more general max-min learning setting that we introduce, where we are given "approximate distributions," and we seek to compute an auction whose revenue is approximately optimal simultaneously for all "true distributions" that are close to the given ones.

Optimum Statistical Estimation with Strategic Data Sources

no code implementations • 11 Aug 2014 • Yang Cai, Constantinos Daskalakis, Christos H. Papadimitriou

We propose an optimum mechanism for providing monetary incentives to the data sources of a statistical estimator such as linear regression, so that high-quality data is provided at low cost, in the sense that the sum of payments and estimation error is minimized.

regression
