no code implementations • 1 May 2024 • Anran Hu, Junzi Zhang
MF-OML is the first fully polynomial multi-agent reinforcement learning algorithm that provably computes Nash equilibria (up to mean-field approximation gaps that vanish as the number of players $N$ goes to infinity) beyond variants of zero-sum and potential games.
no code implementations • 16 Jul 2023 • Xin Guo, Lihong Li, Sareh Nabi, Rabih Salhab, Junzi Zhang
Motivated by bid recommendation in online ad auctions, this paper considers a general class of multi-level and multi-agent games, with two major characteristics: one is a large number of anonymous agents, and the other is the intricate interplay between competition and cooperation.
1 code implementation • 4 May 2023 • Ziheng Cheng, Junzi Zhang, Akshay Agrawal, Stephen Boyd
Laplacian regularized stratified models (LRSM) are models that utilize the explicit or implicit network structure of the sub-problems, as defined by categorical features called strata (e.g., age, region, time, or forecast horizon).
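The LRSM idea above can be sketched numerically: fit one parameter vector per stratum, with a graph Laplacian penalty that pulls the parameters of neighboring strata toward each other. This is an illustrative toy (a chain graph over strata, plain gradient descent, and the weight `lam` are all assumptions for the example), not the paper's implementation.

```python
import numpy as np

# Toy Laplacian regularized stratified model (illustrative sketch):
# one least-squares parameter vector theta[k] per stratum k, plus a
# Laplacian penalty lam * sum_{(i,j) in edges} ||theta[i] - theta[j]||^2.

rng = np.random.default_rng(0)
K, n, d = 5, 30, 3                      # strata, samples per stratum, features
X = rng.normal(size=(K, n, d))
theta_true = np.cumsum(rng.normal(scale=0.3, size=(K, d)), axis=0)
y = np.einsum('knd,kd->kn', X, theta_true) + 0.1 * rng.normal(size=(K, n))

# Assumed stratum graph: a chain (e.g., consecutive time periods).
edges = [(k, k + 1) for k in range(K - 1)]
lam = 10.0                              # Laplacian regularization weight (assumed)

theta = np.zeros((K, d))
lr = 1e-2
for _ in range(2000):
    # Per-stratum least-squares gradient.
    resid = np.einsum('knd,kd->kn', X, theta) - y
    grad = np.einsum('knd,kn->kd', X, resid) / n
    # Laplacian penalty gradient couples neighboring strata.
    for i, j in edges:
        diff = theta[i] - theta[j]
        grad[i] += lam * diff
        grad[j] -= lam * diff
    theta -= lr * grad

# A larger lam shrinks the parameter differences across neighboring strata.
spread = max(np.linalg.norm(theta[i] - theta[j]) for i, j in edges)
```

With `lam = 0` this decouples into independent per-stratum regressions; as `lam` grows, all strata are pushed toward a single shared model.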
no code implementations • 19 Oct 2021 • Yuhao Ding, Junzi Zhang, Hyunin Lee, Javad Lavaei
Our results are the first global convergence and sample complexity guarantees for the stochastic entropy-regularized vanilla PG method.
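The objective studied above, $\max_\theta \, \mathbb{E}_{a \sim \pi_\theta}[r(a)] + \tau H(\pi_\theta)$, can be illustrated with a minimal stochastic vanilla PG loop on a two-armed bandit. This is a sketch of the entropy-regularized objective only; the rewards, temperature `tau`, and step size are made-up values, and none of this reproduces the paper's algorithm or analysis.

```python
import numpy as np

# Stochastic entropy-regularized vanilla policy gradient (REINFORCE-style)
# on a two-armed bandit with a softmax policy. Illustrative sketch only.

rng = np.random.default_rng(1)
r = np.array([1.0, 0.2])        # expected rewards of the two arms (assumed)
tau = 0.1                       # entropy temperature (assumed)
theta = np.zeros(2)             # softmax logits

def pi(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

lr = 0.1
for _ in range(3000):
    p = pi(theta)
    a = rng.choice(2, p=p)
    reward = r[a] + 0.1 * rng.normal()      # noisy single-sample reward
    # Score-function estimate of grad E[r]: reward * grad log pi(a).
    score = -p.copy()
    score[a] += 1.0
    # Exact gradient of tau * H(pi) through the softmax.
    ent_grad = -p * (np.log(p) + 1.0)
    ent_grad -= p * ent_grad.sum()
    theta += lr * (reward * score + tau * ent_grad)

p = pi(theta)
# The regularized optimum is softmax(r / tau), which still favors arm 0.
```

Entropy regularization keeps the policy stochastic: instead of collapsing onto the best arm, the iterates approach the softened optimum `softmax(r / tau)`.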
no code implementations • 19 Oct 2021 • Yuhao Ding, Junzi Zhang, Javad Lavaei
For generic Fisher-non-degenerate policy parametrizations, our result gives the first single-loop, finite-batch PG algorithm achieving $\tilde{O}(\epsilon^{-3})$ global optimality sample complexity.
no code implementations • 13 Sep 2021 • Xin Guo, Anran Hu, Junzi Zhang
To the best of our knowledge, this is the first theoretical guarantee for fictitious discount algorithms in the episodic reinforcement learning of finite-time-horizon MDPs; it also yields the first global convergence result for policy gradient methods in this setting.
no code implementations • 22 Oct 2020 • Junzi Zhang, Jongho Kim, Brendan O'Donoghue, Stephen Boyd
Policy gradient methods are among the most effective methods for large-scale reinforcement learning, and their empirical success has prompted several works that develop the foundation of their global convergence theory.
no code implementations • 13 Mar 2020 • Xin Guo, Anran Hu, Renyuan Xu, Junzi Zhang
This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision-making in stochastic games with a large population.
no code implementations • 25 Nov 2018 • Andrea Zanette, Junzi Zhang, Mykel J. Kochenderfer
This paper focuses on the problem of determining as large a region as possible where a function exceeds a given threshold with high probability.