1 code implementation • 17 Aug 2023 • Ali Ramezani-Kebrya, Kimon Antonakopoulos, Igor Krawczuk, Justin Deschenaux, Volkan Cevher
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local stochastic dual vectors.
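For intuition, the sketch below shows one way a server could combine the workers' local stochastic dual vectors into a single extra-gradient-style update; the names (`local_dual_vector`, `workers`) and the plain averaging/extragradient template are illustrative assumptions, not the communication scheme analyzed in the paper.

```python
import numpy as np

def averaged_dual_vector(x, workers, local_dual_vector):
    """Average the local stochastic dual vectors (operator estimates)
    that each worker reports at the query point x."""
    return np.mean([local_dual_vector(w, x) for w in workers], axis=0)

def distributed_extragradient_step(x, workers, local_dual_vector, step=0.1):
    # Exploration step using the averaged stochastic dual vector.
    x_half = x - step * averaged_dual_vector(x, workers, local_dual_vector)
    # Update step, re-evaluated at the extrapolated point.
    return x - step * averaged_dual_vector(x_half, workers, local_dual_vector)
```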
no code implementations • 3 Nov 2022 • Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, Leello Tadesse Dadi, Volkan Cevher
We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure.
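As a rough illustration of the variance-reduction template behind SPIDER-type methods, the step below recursively corrects the previous gradient estimate and periodically refreshes it with a full gradient; the fixed `step` and `refresh_every` arguments are placeholders and do not reflect AdaSpider's adaptive, parameter-free schedule.

```python
import numpy as np

def spider_style_step(x, x_prev, v_prev, t, grad_i, full_grad, n, refresh_every, step):
    """One SPIDER-type variance-reduced update for a finite sum of n components.

    grad_i(i, x): gradient of component i at x; full_grad(x): full gradient at x.
    """
    if t % refresh_every == 0:
        v = full_grad(x)                                # periodic full-gradient refresh
    else:
        i = np.random.randint(n)                        # sample one component
        v = grad_i(i, x) - grad_i(i, x_prev) + v_prev   # recursive correction term
    return x - step * v, v
```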
no code implementations • 3 Nov 2022 • Kimon Antonakopoulos, Ali Kavis, Volkan Cevher
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions.
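Here "second-order smooth" refers to Lipschitz continuity of the Hessian; one standard way to state the condition (with $L_2$ the Hessian Lipschitz constant, notation ours) is:

$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \le L_2 \, \|x - y\| \quad \text{for all } x, y.$$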
no code implementations • 13 Jun 2022 • Yu-Guan Hsieh, Kimon Antonakopoulos, Volkan Cevher, Panayotis Mertikopoulos
We examine the problem of regret minimization when the learner is involved in a continuous game with other optimizing agents: if all players follow a no-regret algorithm, it is possible to achieve significantly lower regret than in fully adversarial environments.
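The benchmark being minimized is the player's individual regret, which for loss functions $\ell_t$ and feasible set $\mathcal{X}$ is typically written as (notation ours, not necessarily the paper's):

$$\mathrm{Reg}_T \;=\; \max_{u \in \mathcal{X}} \sum_{t=1}^{T} \big[ \ell_t(x_t) - \ell_t(u) \big].$$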
no code implementations • NeurIPS 2021 • Kimon Antonakopoulos, Thomas Pethick, Ali Kavis, Panayotis Mertikopoulos, Volkan Cevher
Our first result is that the algorithm achieves the optimal rates of convergence for cocoercive problems when the profile of the randomness is known to the optimizer: $\mathcal{O}(1/\sqrt{T})$ for absolute noise profiles, and $\mathcal{O}(1/T)$ for relative ones.
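One common way to formalize the two noise profiles, with $V$ the problem's operator and $\hat V(x)$ its stochastic oracle (the paper's exact conditions may differ), is:

$$\text{absolute:}\;\; \mathbb{E}\big\|\hat V(x) - V(x)\big\|^2 \le \sigma^2, \qquad \text{relative:}\;\; \mathbb{E}\big\|\hat V(x) - V(x)\big\|^2 \le \sigma^2 \big\|V(x)\big\|^2.$$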
no code implementations • NeurIPS 2021 • Dong Quan Vu, Kimon Antonakopoulos, Panayotis Mertikopoulos
We examine an adaptive learning framework for nonatomic congestion games where the players' cost functions may be subject to exogenous fluctuations (e.g., due to disturbances in the network or variations in the traffic going through a link).
no code implementations • NeurIPS 2021 • Kimon Antonakopoulos, Panayotis Mertikopoulos
We propose a new family of adaptive first-order methods for a class of convex minimization problems that may fail to be Lipschitz continuous or smooth in the standard sense.
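To fix ideas, an adaptive first-order step of the AdaGrad-norm type is sketched below: the step size is built from observed gradient magnitudes rather than a known Lipschitz or smoothness constant. This is a generic stand-in, not the Bregman-based scheme developed in the paper.

```python
import numpy as np

def adagrad_norm_descent(x0, grad, n_steps, base_step=1.0):
    """Gradient descent with an AdaGrad-norm step size: no Lipschitz
    constant is assumed; the step adapts to the gradients seen so far."""
    x = np.asarray(x0, dtype=float)
    accum = 0.0
    for _ in range(n_steps):
        g = grad(x)
        accum += float(np.dot(g, g))                     # running sum of squared gradient norms
        x = x - (base_step / np.sqrt(1.0 + accum)) * g
    return x
```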
no code implementations • 26 Apr 2021 • Yu-Guan Hsieh, Kimon Antonakopoulos, Panayotis Mertikopoulos
In game-theoretic learning, several agents simultaneously pursue their individual interests, so the environment is non-stationary from each player's perspective.
no code implementations • ICLR 2021 • Kimon Antonakopoulos, E. Veronica Belmega, Panayotis Mertikopoulos
We present a new family of min-max optimization algorithms that automatically exploit the geometry of the gradient data observed at earlier iterations to perform more informative extra-gradient steps in later ones.
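A bare-bones version of an extra-gradient loop with an adaptive (AdaGrad-style) scalar step size is sketched below; the function names and the specific step-size rule are illustrative assumptions, and the paper's methods additionally handle Bregman geometries.

```python
import numpy as np

def adaptive_extragradient(z0, oracle, n_steps, base_step=1.0):
    """Extra-gradient iteration with a step size that adapts to the
    gradient data observed so far. oracle(z) returns the game's gradient
    field at z, e.g. (grad_x f, -grad_y f) stacked for a min-max problem."""
    z = np.asarray(z0, dtype=float)
    accum = 0.0
    for _ in range(n_steps):
        g = oracle(z)
        accum += float(np.dot(g, g))
        step = base_step / np.sqrt(1.0 + accum)   # shrinks as more gradient data is observed
        z_half = z - step * g                     # exploration (leading) step
        z = z - step * oracle(z_half)             # update at the extrapolated point
    return z
```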
no code implementations • ICLR 2020 • Kimon Antonakopoulos, E. Veronica Belmega, Panayotis Mertikopoulos
Motivated by applications to machine learning and imaging science, we study a class of online and stochastic optimization problems with loss functions that are not Lipschitz continuous; in particular, the loss functions encountered by the optimizer could exhibit gradient singularities or be singular themselves.
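A simple instance of such a singularity (our illustration): the barrier-type loss $f(x) = -\log x$ on $(0, 1]$ has derivative $f'(x) = -1/x$, which is unbounded as $x \to 0^+$, so neither $f$ nor its gradient admits a global Lipschitz constant on the feasible region.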
no code implementations • NeurIPS 2019 • Kimon Antonakopoulos, Veronica Belmega, Panayotis Mertikopoulos
Lipschitz continuity is a central requirement for achieving the optimal $\mathcal{O}(1/T)$ rate of convergence in monotone, deterministic variational inequalities (a setting that includes convex minimization, convex-concave optimization, nonatomic games, and many other problems).
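For concreteness, the variational inequality setting referred to here is the standard one (notation ours): find $x^* \in \mathcal{X}$ such that

$$\langle V(x^*),\, x - x^* \rangle \ge 0 \quad \text{for all } x \in \mathcal{X},$$

where the operator $V$ is monotone, i.e. $\langle V(x) - V(y),\, x - y \rangle \ge 0$ for all $x, y \in \mathcal{X}$.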
no code implementations • 12 Sep 2018 • Ali Ramezani-Kebrya, Kimon Antonakopoulos, Volkan Cevher, Ashish Khisti, Ben Liang
While momentum-based accelerated variants of stochastic gradient descent (SGD) are widely used when training machine learning models, there is little theoretical understanding of the generalization error of such methods.
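For reference, the heavy-ball form of SGD with momentum is sketched below; the hyperparameter values and function names are placeholders, and other accelerated variants (e.g. Nesterov momentum) differ slightly in where the gradient is evaluated.

```python
import numpy as np

def momentum_sgd(w0, stochastic_grad, n_steps, lr=0.01, momentum=0.9):
    """Heavy-ball SGD: the update direction is an exponentially weighted
    running sum of past stochastic (minibatch) gradients."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    for _ in range(n_steps):
        g = stochastic_grad(w)        # minibatch gradient at the current iterate
        v = momentum * v + g          # accumulate momentum
        w = w - lr * v                # parameter update
    return w
```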