no code implementations • ICML 2020 • Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, Peter Richtárik
Most algorithms for solving optimization problems or finding saddle points of convex-concave functions are fixed-point algorithms.
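The fixed-point view is easy to make concrete; a minimal sketch, assuming the gradient-descent map $T(x) = x - \gamma \nabla f(x)$ for a quadratic $f$, whose fixed points are exactly the stationary points:

```python
import numpy as np

# Gradient-descent map for f(x) = 0.5 * x^T A x - b^T x:
# T(x) = x - gamma * (A x - b). Its fixed points satisfy A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
gamma = 0.2  # small enough that T is a contraction for this A

def T(x):
    return x - gamma * (A @ x - b)

x = np.zeros(2)
for _ in range(200):  # plain fixed-point (Picard) iteration
    x = T(x)

print(x, np.linalg.solve(A, b))  # the iterate approaches the fixed point
```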
no code implementations • 16 Feb 2024 • Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko
Error Feedback (EF) is a highly popular and immensely effective mechanism for fixing convergence issues which arise in distributed training methods (such as distributed GD or SGD) when these are enhanced with greedy communication compression techniques such as TopK.
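For concreteness, here is a minimal single-worker sketch of classic error feedback wrapped around a TopK compressor; the helper names `topk` and `ef_sgd_step` are mine, not from the paper:

```python
import numpy as np

def topk(v, k):
    """Greedy Top-K compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef_sgd_step(x, e, grad, lr, k):
    """One step of compressed (S)GD with classic error feedback.

    The residual e accumulates the compression error and is added back
    before compressing, which is what fixes the divergence that naive
    greedy compression (TopK alone) can cause.
    """
    p = lr * grad + e   # error-corrected update
    delta = topk(p, k)  # what is actually communicated
    e = p - delta       # carry the residual to the next step
    return x - delta, e
```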
no code implementations • 6 Jun 2023 • Rafał Szlendak, Elnur Gasanov, Peter Richtárik
We propose a Randomized Progressive Training algorithm (RPT) -- a stochastic proxy for the well-known Progressive Training method (PT) (Karras et al., 2017).
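The mechanism below is a hypothetical illustration only, assuming RPT replaces PT's deterministic growth schedule with a randomly sampled training depth at each step; the actual sampling rule and analysis are in the paper:

```python
import random

def rpt_step(blocks, optimizer_step, batch, probs):
    """Hypothetical sketch: rather than growing the model on a fixed
    schedule as in Progressive Training, sample how many leading
    blocks to train at this step (probs is assumed, illustrative)."""
    depth = random.choices(range(1, len(blocks) + 1), weights=probs)[0]
    submodel = blocks[:depth]        # progressively larger prefix
    optimizer_step(submodel, batch)  # train only the sampled prefix
```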
1 code implementation • 24 May 2023 • Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko
To illustrate our main result, we show that in order to find a random vector $\hat{x}$ such that $\lVert {\nabla f(\hat{x})} \rVert^2 \leq \varepsilon$ in expectation, ${\color{green}\sf GD}$ with the ${\color{green}\sf Top1}$ sparsifier and ${\color{green}\sf EF}$ requires ${\cal O} \left(\left( L+{\color{blue}r} \sqrt{ \frac{{\color{red}c}}{n} \min \left( \frac{{\color{red}c}}{n} \max_i L_i^2, \frac{1}{n}\sum_{i=1}^n L_i^2 \right) }\right) \frac{1}{\varepsilon} \right)$ bits to be communicated by each worker to the server only, where $L$ is the smoothness constant of $f$, $L_i$ is the smoothness constant of $f_i$, ${\color{red}c}$ is the maximal number of clients owning any feature ($1\leq {\color{red}c} \leq n$), and ${\color{blue}r}$ is the maximal number of features owned by any client ($1\leq {\color{blue}r} \leq d$).
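The constants ${\color{red}c}$ and ${\color{blue}r}$ can be read directly off the feature-ownership pattern; a small sketch, assuming ownership is given as a boolean $n \times d$ mask (clients by features):

```python
import numpy as np

# owns[i, j] = True iff client i's function f_i depends on feature j.
owns = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]], dtype=bool)

c = owns.sum(axis=0).max()  # max number of clients sharing any one feature
r = owns.sum(axis=1).max()  # max number of features owned by any one client
print(c, r)                 # here: c = 2, r = 2
```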
no code implementations • 31 Oct 2022 • Maksim Makarenko, Elnur Gasanov, Rustem Islamov, Abdurakhmon Sadiev, Peter Richtarik
We propose Adaptive Compressed Gradient Descent (AdaCGD) -- a novel optimization algorithm for communication-efficient training of supervised machine learning models with adaptive compression level.
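As an interface-level illustration only (the rule below is invented for exposition and is not AdaCGD's actual adaptation rule), one could pick the sparsification level by an energy criterion:

```python
import numpy as np

def adaptive_k(v, ks, tol=0.9):
    """Illustrative only: choose the smallest Top-K level from the menu
    `ks` that retains at least `tol` of the vector's energy. AdaCGD's
    real adaptation rule is specified in the paper; this only shows
    what 'adaptive compression level' means as an interface."""
    mags = np.sort(np.abs(v))[::-1]
    energy = np.cumsum(mags**2) / np.sum(mags**2)
    for k in sorted(ks):
        if energy[k - 1] >= tol:
            return k
    return max(ks)
```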
no code implementations • 2 Feb 2022 • Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin, Elnur Gasanov, Zhize Li, Eduard Gorbunov
We propose and study a new class of gradient communication mechanisms for communication-efficient training -- three point compressors (3PC) -- as well as efficient distributed nonconvex optimization algorithms that can take advantage of them.
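One member of the 3PC class recovered in the paper is the EF21 mechanism; a minimal sketch of its per-worker update, with `topk` standing in for a generic contractive compressor:

```python
import numpy as np

def topk(v, k):
    """Top-K as an example of a contractive compressor."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ef21_update(h, g, k):
    """EF21-style update: each worker maintains a state h tracking its
    gradient g and communicates only the compressed correction C(g - h),
    which the server can apply to its own copy of h."""
    delta = topk(g - h, k)  # compressed message sent to the server
    return h + delta        # new state, reproducible on the server side
```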
no code implementations • NeurIPS 2021 • Dmitry Kovalev, Elnur Gasanov, Alexander Gasnikov, Peter Richtárik
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network whose links are allowed to change in time.
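A minimal sketch of the problem setting (not the paper's optimal algorithm): decentralized gradient descent where the mixing matrix of the communication graph may differ from round to round:

```python
import numpy as np

def decentralized_gd_step(X, grads, W_t, lr):
    """One round of decentralized gradient descent. X stacks the nodes'
    iterates row-wise, grads stacks their local gradients, and W_t is
    the doubly stochastic mixing matrix of the *current* communication
    graph, which is allowed to change between rounds."""
    return W_t @ X - lr * grads  # gossip averaging + local gradient step
```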
no code implementations • 22 Nov 2021 • Elnur Gasanov, Ahmed Khaled, Samuel Horváth, Peter Richtárik
A persistent problem in federated learning is that it is not clear what the optimization objective should be: the standard average risk minimization of supervised learning is inadequate in handling several major constraints specific to federated learning, such as communication adaptivity and personalization control.
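Illustrative only, and not necessarily the paper's exact formulation: one way to make personalization control explicit in the objective is a per-client mixture between a shared model and each client's purely local model:

```python
import numpy as np

def personalized_objective(x, local_opts, alphas, losses):
    """Illustrative sketch: a per-client weight alpha_i interpolates
    between the shared model x (alpha_i = 1) and the client's purely
    local model x_i^* (alpha_i = 0), turning the degree of
    personalization into an explicit knob in the objective."""
    vals = [f(a * x + (1 - a) * xi)  # evaluate loss at the mixed model
            for f, a, xi in zip(losses, alphas, local_opts)]
    return np.mean(vals)
```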