no code implementations • ICML 2020 • Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, Peter Richtarik
Most algorithms for solving optimization problems or finding saddle points of convex-concave functions are fixed-point algorithms.
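For intuition, gradient descent itself can be read as a fixed-point iteration x ← T(x) with T(x) = x − γ∇f(x). A minimal sketch on a hypothetical least-squares objective (an illustration of the fixed-point viewpoint, not the paper's method):

```python
import numpy as np

# Hypothetical quadratic objective f(x) = 0.5 * ||A x - b||^2; the minimizer
# of f is the unique fixed point of T(x) = x - gamma * grad_f(x).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def T(x, gamma=0.1):
    grad = A.T @ (A @ x - b)
    return x - gamma * grad

x = np.zeros(2)
for _ in range(200):
    x = T(x)                       # fixed-point iteration
print(x, np.linalg.solve(A.T @ A, A.T @ b))  # iterate vs. exact minimizer
```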
no code implementations • ICML 2020 • Zhize Li, Dmitry Kovalev, Xun Qian, Peter Richtarik
Due to the high communication cost in distributed and federated learning problems, methods relying on sparsification or quantization of communicated messages are becoming increasingly popular.
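As a generic illustration of the kind of compression operator such methods rely on (not the paper's specific scheme), here is an unbiased random-k sparsifier:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Unbiased random-k sparsification: keep k random coordinates and
    rescale by d/k so that E[rand_k(v, k)] = v."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

v = rng.standard_normal(10)
print(np.mean([rand_k(v, 3) for _ in range(20000)], axis=0))  # approx. v
```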
no code implementations • 29 Dec 2022 • Alexander Gasnikov, Dmitry Kovalev, Grigory Malinovsky
In this paper we study the smooth strongly convex minimization problem $\min_{x}\min_y f(x, y)$.
no code implementations • 29 Aug 2022 • Aleksandr Beznosikov, Boris Polyak, Eduard Gorbunov, Dmitry Kovalev, Alexander Gasnikov
This paper is a survey of methods for solving smooth (strongly) monotone stochastic variational inequalities.
no code implementations • 8 Jul 2022 • Abdurakhmon Sadiev, Dmitry Kovalev, Peter Richtárik
Inspired by a recent breakthrough of Mishchenko et al. (2022), who for the first time showed that local gradient steps can lead to provable communication acceleration, we propose an alternative algorithm which obtains the same communication acceleration as their method (ProxSkip).
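For intuition, a minimal single-machine sketch of a ProxSkip-style update, based on our reading of Mishchenko et al. (2022); the smooth term f(x) = 0.5‖x − a‖² and nonsmooth term ψ(x) = λ‖x‖₁ below are hypothetical, and details may differ from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# min_x f(x) + psi(x); the prox is applied only with probability p, and the
# control variate h compensates for the skipped prox steps.
a = np.array([2.0, -0.5, 0.0])
lam, gamma, p = 0.3, 0.5, 0.2

def grad_f(x):
    return x - a

def prox_psi(x, step):
    return np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)  # soft-thresholding

x = np.zeros(3)
h = np.zeros(3)
for _ in range(2000):
    x_hat = x - gamma * (grad_f(x) - h)
    if rng.random() < p:                               # rare prox ("communication") step
        x_new = prox_psi(x_hat - (gamma / p) * h, gamma / p)
    else:
        x_new = x_hat
    h = h + (p / gamma) * (x_new - x_hat)
    x = x_new
print(x)  # close to the minimizer of 0.5*||x-a||^2 + lam*||x||_1
```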
no code implementations • 16 Jun 2022 • Aleksandr Beznosikov, Aibek Alanov, Dmitry Kovalev, Martin Takáč, Alexander Gasnikov
Methods with adaptive scaling of different features play a key role in solving saddle point problems, primarily due to Adam's popularity for solving adversarial machine learning problems, including GAN training.
no code implementations • 30 May 2022 • Dmitry Kovalev, Aleksandr Beznosikov, Ekaterina Borodich, Alexander Gasnikov, Gesualdo Scutari
Finally, the method is extended to distributed saddle-point problems (under function similarity) by means of solving a class of variational inequalities, achieving lower communication and computation complexity bounds.
1 code implementation • 19 May 2022 • Dmitry Kovalev, Alexander Gasnikov
Arjevani et al. (2019) established the lower bound $\Omega\left(\epsilon^{-2/(3p+1)}\right)$ on the number of the $p$-th order oracle calls required by an algorithm to find an $\epsilon$-accurate solution to the problem, where the $p$-th order oracle stands for the computation of the objective function value and the derivatives up to the order $p$.
no code implementations • 11 May 2022 • Dmitry Kovalev, Alexander Gasnikov
However, the existing state-of-the-art methods do not match this lower bound: algorithms of Lin et al. (2020) and Wang and Li (2020) have gradient evaluation complexity $\mathcal{O}\left( \sqrt{\kappa_x\kappa_y}\log^3\frac{1}{\epsilon}\right)$ and $\mathcal{O}\left( \sqrt{\kappa_x\kappa_y}\log^3 (\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$, respectively.
no code implementations • 11 Feb 2022 • Evgenia Romanenkova, Alina Rogulina, Anuar Shakirov, Nikolay Stulov, Alexey Zaytsev, Leyla Ismailova, Dmitry Kovalev, Klemens Katterbauer, Abdallah AlShehri
In essence, interwell correlation amounts to assessing the similarity between geological profiles.
no code implementations • 6 Feb 2022 • Dmitry Kovalev, Aleksandr Beznosikov, Abdurakhmon Sadiev, Michael Persiianov, Peter Richtárik, Alexander Gasnikov
Our algorithms are the best in the available literature, not only in the decentralized stochastic case but also in the decentralized deterministic and non-distributed stochastic cases.
no code implementations • 30 Dec 2021 • Dmitry Kovalev, Alexander Gasnikov, Peter Richtárik
In this paper we study the convex-concave saddle-point problem $\min_x \max_y f(x) + y^T \mathbf{A} x - g(y)$, where $f(x)$ and $g(y)$ are smooth and convex functions.
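For concreteness, a toy instance of this saddle-point template solved by plain gradient descent-ascent; this only illustrates the problem structure, not the paper's optimal algorithm, and the choices of f, g, and A below are hypothetical:

```python
import numpy as np

# L(x, y) = f(x) + y^T A x - g(y) with f(x) = 0.5*||x - a||^2, g(y) = 0.5*||y - c||^2.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
a = np.array([1.0, 0.0])
c = np.array([0.0, 1.0])
eta = 0.03

x = np.zeros(2)
y = np.zeros(2)
for _ in range(4000):
    gx = (x - a) + A.T @ y          # gradient of L in x
    gy = A @ x - (y - c)            # gradient of L in y
    x, y = x - eta * gx, y + eta * gy
# closed-form saddle point for comparison: (I + A^T A) x* = a - A^T c
x_star = np.linalg.solve(np.eye(2) + A.T @ A, a - A.T @ c)
print(x, x_star)
```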
no code implementations • NeurIPS 2021 • Dmitry Kovalev, Elnur Gasanov, Alexander Gasnikov, Peter Richtarik
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network whose links are allowed to change in time.
no code implementations • 22 Feb 2021 • Adil Salim, Laurent Condat, Dmitry Kovalev, Peter Richtárik
Optimization problems under affine constraints appear in various areas of machine learning.
Optimization and Control
no code implementations • 18 Feb 2021 • Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Alexander Rogozin, Alexander Gasnikov
We propose ADOM, an accelerated method for smooth and strongly convex decentralized optimization over time-varying networks.
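As background, the communication primitive underlying decentralized methods over time-varying networks is gossip averaging with a time-varying mixing matrix; a minimal sketch (not ADOM itself):

```python
import numpy as np

rng = np.random.default_rng(2)

# Gossip steps x <- W_t x with a doubly stochastic mixing matrix W_t that
# changes over time; repeated mixing drives the per-node values to the average.
W_ring = np.array([[0.50, 0.25, 0.00, 0.25],
                   [0.25, 0.50, 0.25, 0.00],
                   [0.00, 0.25, 0.50, 0.25],
                   [0.25, 0.00, 0.25, 0.50]])
W_star = np.array([[0.25, 0.25, 0.25, 0.25],
                   [0.25, 0.75, 0.00, 0.00],
                   [0.25, 0.00, 0.75, 0.00],
                   [0.25, 0.00, 0.00, 0.75]])

x = rng.standard_normal(4)        # one scalar value per node
target = x.mean()
for _ in range(100):
    W = W_ring if rng.random() < 0.5 else W_star   # network changes over time
    x = W @ x
print(x, target)                  # all entries close to the initial average
```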
1 code implementation • ICLR 2022 • Konstantin Mishchenko, Bokun Wang, Dmitry Kovalev, Peter Richtárik
We propose a family of adaptive integer compression operators for distributed Stochastic Gradient Descent (SGD) that do not communicate a single float.
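A generic sketch of the underlying idea of transmitting only integers via unbiased stochastic rounding (the paper's adaptive scaling rule is not reproduced here, and `scale` below is a hypothetical fixed parameter):

```python
import numpy as np

rng = np.random.default_rng(3)

def int_round(v, scale):
    """Unbiased stochastic rounding of v / scale to integers; the receiver
    reconstructs v approximately as ints * scale, with E[ints * scale] = v."""
    z = v / scale
    low = np.floor(z)
    ints = (low + (rng.random(z.shape) < (z - low))).astype(np.int64)
    return ints

v = rng.standard_normal(5)
scale = 0.01
print(v, int_round(v, scale) * scale)  # reconstruction close to v
```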
no code implementations • 15 Feb 2021 • Alexander Rogozin, Alexander Beznosikov, Darina Dvinskikh, Dmitry Kovalev, Pavel Dvurechensky, Alexander Gasnikov
We consider distributed convex-concave saddle point problems over arbitrary connected undirected networks and propose a decentralized distributed algorithm for their solution.
Distributed Optimization • Optimization and Control • Distributed, Parallel, and Cluster Computing
no code implementations • 3 Nov 2020 • Dmitry Kovalev, Anastasia Koloskova, Martin Jaggi, Peter Richtarik, Sebastian U. Stich
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
1 code implementation • NeurIPS 2020 • Eduard Gorbunov, Dmitry Kovalev, Dmitry Makarenko, Peter Richtárik
Moreover, using our general scheme, we develop new variants of SGD that combine variance reduction or arbitrary sampling with error feedback and quantization, and we derive convergence rates for these methods that beat the state-of-the-art results.
no code implementations • NeurIPS 2020 • Dmitry Kovalev, Adil Salim, Peter Richtárik
We propose two new algorithms for this decentralized optimization problem and equip them with complexity guarantees.
no code implementations • 3 Apr 2020 • Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, Peter Richtárik
Most algorithms for solving optimization problems or finding saddle points of convex-concave functions are fixed-point algorithms.
no code implementations • 26 Feb 2020 • Zhize Li, Dmitry Kovalev, Xun Qian, Peter Richtárik
Due to the high communication cost in distributed and federated learning problems, methods relying on compression of communicated messages are becoming increasingly popular.
no code implementations • ICML 2020 • Filip Hanzely, Dmitry Kovalev, Peter Richtarik
We propose an accelerated version of stochastic variance-reduced coordinate descent, ASVRCD.
no code implementations • 20 Dec 2019 • Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč
We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed.
1 code implementation • 3 Dec 2019 • Dmitry Kovalev, Konstantin Mishchenko, Peter Richtárik
We present two new remarkably simple stochastic second-order methods for minimizing the average of a very large number of sufficiently smooth and strongly convex functions.
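For context, a generic subsampled Newton step for a finite-sum problem (illustration only; not necessarily the specific methods proposed in the paper, and the regularized quadratics f_i below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)

# f(x) = (1/n) * sum_i f_i(x) with f_i(x) = 0.5*(a_i^T x - b_i)^2 + 0.5*lam*||x||^2.
n, d = 50, 3
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 0.1

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i] + lam * x

def hess_i(i):
    return np.outer(A[i], A[i]) + lam * np.eye(d)

x = np.zeros(d)
for _ in range(50):
    S = rng.choice(n, size=10, replace=False)      # minibatch of component functions
    g = np.mean([grad_i(x, i) for i in S], axis=0)
    H = np.mean([hess_i(i) for i in S], axis=0)
    x = x - np.linalg.solve(H, g)                  # Newton-type step
print(x)  # approximate minimizer; subsampling noise prevents exact convergence
```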
1 code implementation • NeurIPS 2019 • Adil Salim, Dmitry Kovalev, Peter Richtárik
We propose a new algorithm, the Stochastic Proximal Langevin Algorithm (SPLA), for sampling from a log-concave distribution.
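For context, a minimal unadjusted Langevin step for a smooth log-concave target (illustration only; SPLA additionally handles nonsmooth terms via stochastic proximal steps, which are not reproduced here). The Gaussian target below is a hypothetical choice:

```python
import numpy as np

rng = np.random.default_rng(5)

# Sample from a density proportional to exp(-U(x)) with U(x) = 0.5 * x^T P x,
# i.e. a Gaussian with covariance inv(P), using the unadjusted Langevin step
# x <- x - gamma * grad U(x) + sqrt(2*gamma) * noise.
P = np.array([[2.0, 0.5], [0.5, 1.0]])
gamma = 0.05

x = np.zeros(2)
samples = []
for t in range(20000):
    x = x - gamma * (P @ x) + np.sqrt(2 * gamma) * rng.standard_normal(2)
    if t > 5000:                    # discard burn-in
        samples.append(x.copy())
print(np.cov(np.array(samples).T))  # roughly inv(P), up to discretization bias
```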
no code implementations • 27 May 2019 • Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtárik, Yura Malitsky
We fix a fundamental issue in the stochastic extragradient method by providing a new sampling strategy that is motivated by approximating implicit updates.
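For reference, the classical extragradient template on a bilinear toy problem, where plain gradient descent-ascent diverges; this shows the extragradient update only, not the paper's stochastic sampling strategy:

```python
import numpy as np

# min_x max_y x^T A y: extrapolate with the current gradients, then update
# using the gradients evaluated at the extrapolated point.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
eta = 0.15

x = np.array([1.0, -1.0])
y = np.array([0.5, 2.0])
for _ in range(2000):
    x_half = x - eta * (A @ y)        # extrapolation step
    y_half = y + eta * (A.T @ x)
    x = x - eta * (A @ y_half)        # update step uses the extrapolated point
    y = y + eta * (A.T @ x_half)
print(np.linalg.norm(x), np.linalg.norm(y))  # both shrink toward the saddle (0, 0)
```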
no code implementations • 24 Jan 2019 • Dmitry Kovalev, Samuel Horvath, Peter Richtarik
A key structural element of both methods is an outer loop, at the start of which a full pass over the training data is made to compute the exact gradient, which is then used to construct a variance-reduced estimator of the gradient.
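For intuition, a minimal sketch of the SVRG-type construction described here, on a hypothetical least-squares objective: the snapshot w is refreshed by a full gradient pass, and each inner step uses the variance-reduced estimator g = ∇f_i(x) − ∇f_i(w) + ∇f(w):

```python
import numpy as np

rng = np.random.default_rng(6)

# f(x) = (1/2n) * ||A x - b||^2, written as an average of per-sample losses.
n, d = 100, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(x, i):
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
step = 0.005
for epoch in range(30):
    w, mu = x.copy(), full_grad(x)          # outer loop: exact gradient at snapshot w
    for _ in range(n):
        i = rng.integers(n)
        g = grad_i(x, i) - grad_i(w, i) + mu  # unbiased, variance-reduced estimator
        x = x - step * g
print(np.linalg.norm(full_grad(x)))         # small: x is near the minimizer
```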