no code implementations • 27 Feb 2024 • Ilyas Fatkhullin, Niao He
This paper revisits the convergence of Stochastic Mirror Descent (SMD) in the contemporary nonconvex optimization setting.
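For concreteness, one SMD step with mirror map $\psi$ solves $x_{t+1} = \arg\min_{x} \{\eta \langle g_t, x \rangle + D_{\psi}(x, x_t)\}$, where $g_t$ is a stochastic gradient and $D_{\psi}$ is the Bregman divergence of $\psi$. Below is a minimal sketch, assuming the negative-entropy mirror map on the probability simplex (so the step reduces to an exponentiated-gradient update); the names and constants are illustrative, not taken from the paper.

```python
import numpy as np

def smd_step(x, g, eta):
    """One SMD step with the negative-entropy mirror map on the simplex:
        x_{t+1} = argmin_x  eta * <g, x> + KL(x || x_t),
    whose closed form is x_{t+1} proportional to x_t * exp(-eta * g)."""
    logits = np.log(x) - eta * g
    logits -= logits.max()                 # numerical stability
    y = np.exp(logits)
    return y / y.sum()

# Illustrative run: minimize a noisy linear cost <c, x> over the simplex.
rng = np.random.default_rng(0)
x = np.full(5, 0.2)                        # uniform starting point
for _ in range(200):
    g = np.arange(5) + rng.normal(size=5)  # stochastic gradient of <c, x>
    x = smd_step(x, g, eta=0.1)
# x now concentrates on coordinate 0, the smallest cost.
```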
1 code implementation • 8 Sep 2023 • Jiduan Wu, Anas Barakat, Ilyas Fatkhullin, Niao He
Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks a Nash equilibrium (NE) of zero-sum LQ games; (ii) in the model-free setting, we establish an $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point zeroth-order (ZO) estimator.
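As background for (ii), a single-point ZO estimator builds a gradient estimate from one function evaluation per iteration. The following is a generic sketch of the standard one-point estimator $g = (d/\delta)\, f(x + \delta u)\, u$ with $u$ uniform on the unit sphere; the smoothing radius and sampling scheme are common choices, not details confirmed from the paper.

```python
import numpy as np

def single_point_zo_grad(f, x, delta, rng):
    """One-point ZO estimator: g = (d / delta) * f(x + delta * u) * u,
    with u uniform on the unit sphere. In expectation it equals the
    gradient of a smoothed version of f, using a single query of f."""
    d = x.size
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)                 # uniform direction on the sphere
    return (d / delta) * f(x + delta * u) * u

# Illustrative usage on a toy quadratic with grad f(x) = x.
rng = np.random.default_rng(1)
f = lambda z: 0.5 * float(z @ z)
g_hat = single_point_zo_grad(f, np.ones(10), delta=1e-2, rng=rng)
```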
no code implementations • 2 Jun 2023 • Anas Barakat, Ilyas Fatkhullin, Niao He
We consider the reinforcement learning (RL) problem with general utilities, which consists of maximizing a function of the state-action occupancy measure.
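Concretely, in standard notation (ours, not copied from the abstract), the problem is $\max_{\theta} F(\lambda^{\pi_\theta})$, where $\lambda^{\pi}(s,a) = (1-\gamma) \sum_{t \geq 0} \gamma^{t} \Pr(s_t = s, a_t = a \mid \pi)$ is the discounted state-action occupancy measure; standard RL is recovered with the linear utility $F(\lambda) = \langle r, \lambda \rangle$, while a nonlinear $F$ captures objectives that cannot be written as a cumulative reward.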
no code implementations • 3 Feb 2023 • Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He
Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations.
no code implementations • 2 Feb 2022 • Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin, Elnur Gasanov, Zhize Li, Eduard Gorbunov
We propose and study a new class of gradient communication mechanisms for communication-efficient training -- three point compressors (3PC) -- as well as efficient distributed nonconvex optimization algorithms that can take advantage of them.
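One concrete member of this class is EF21, which arises as a special case of 3PC via the update $g^{t+1} = g^t + \mathcal{C}(\nabla f(x^{t+1}) - g^t)$ with a contractive compressor $\mathcal{C}$. A minimal single-node sketch, assuming Top-$k$ as the compressor (the function names are ours):

```python
import numpy as np

def top_k(v, k):
    """Contractive Top-k compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef21_step(g, grad_new, k):
    """EF21 update, a special case of 3PC: only the compressed difference
    top_k(grad_new - g, k) is communicated, and both sender and receiver
    maintain the running gradient estimate g."""
    return g + top_k(grad_new - g, k)

# Illustrative: g tracks a (here fixed) gradient over communication rounds.
grad = np.linspace(-1.0, 1.0, 8)
g = np.zeros(8)
for _ in range(4):
    g = ef21_step(g, grad, k=2)
# g now equals grad, yet only 2 coordinates were sent per round.
```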
no code implementations • 7 Oct 2021 • Ilyas Fatkhullin, Igor Sokolov, Eduard Gorbunov, Zhize Li, Peter Richtárik
First proposed as a heuristic by Seide et al. (2014), error feedback (EF) is a very popular mechanism for enforcing convergence of distributed gradient-based optimization methods that are enhanced with communication compression via contractive compression operators.
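For reference, the classical EF mechanism keeps an error accumulator and reinjects the part of the update that compression discarded. A minimal single-node sketch, assuming a scaled-sign compressor in the spirit of the original 1-bit SGD heuristic; the details are illustrative, not the paper's algorithm.

```python
import numpy as np

def scaled_sign(v):
    """Scaled-sign compressor: (||v||_1 / d) * sign(v). It is contractive,
    since ||C(v) - v||^2 = ||v||^2 - ||v||_1^2 / d <= (1 - 1/d) * ||v||^2."""
    return np.abs(v).mean() * np.sign(v)

def ef_step(x, e, grad, lr):
    """Classical error feedback: compress the error-corrected step, apply
    the compressed part, and carry the residual into the next iteration."""
    p = e + lr * grad            # error-corrected update direction
    delta = scaled_sign(p)       # the only quantity that is communicated
    return x - delta, p - delta  # new iterate, new error accumulator
```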
no code implementations • NeurIPS 2021 • Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin
However, all existing analyses either i) apply to the single-node setting only, ii) rely on very strong and often unreasonable assumptions, such as global boundedness of the gradients or iterate-dependent assumptions that cannot be checked a priori and may not hold in practice, or iii) circumvent these issues by introducing additional unbiased compressors, which increase the communication cost.
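To make point iii) concrete: a contractive compressor such as Top-$k$ (sketched above) is biased but satisfies $\|\mathcal{C}(v) - v\|^2 \leq (1-\alpha)\|v\|^2$, whereas an unbiased compressor such as Rand-$k$ satisfies $\mathbb{E}[\mathcal{C}(v)] = v$ at the price of extra variance and, when added on top of EF, extra communication. A minimal sketch of Rand-$k$ with an illustrative unbiasedness check (the names are ours):

```python
import numpy as np

def rand_k(v, k, rng):
    """Unbiased Rand-k compressor: keep k random coordinates and rescale
    by d / k so that E[rand_k(v)] = v (unlike Top-k, which is biased
    but contractive)."""
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = (d / k) * v[idx]
    return out

# Empirical check: the average over many draws approaches v itself.
rng = np.random.default_rng(2)
v = rng.normal(size=8)
avg = np.mean([rand_k(v, 2, rng) for _ in range(20000)], axis=0)
```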