Accelerated Gradient Tracking over Time-varying Graphs for Decentralized Optimization

6 Apr 2021  ·  Huan Li, Zhouchen Lin ·

Decentralized optimization over time-varying graphs has been increasingly common in modern machine learning with massive data stored on millions of mobile devices, such as in federated learning. This paper revisits the widely used accelerated gradient tracking and extends it to time-varying graphs. We prove the $O((\frac{\gamma}{1-\sigma_{\gamma}})^2\sqrt{\frac{L}{\epsilon}})$ and $O((\frac{\gamma}{1-\sigma_{\gamma}})^{1.5}\sqrt{\frac{L}{\mu}}\log\frac{1}{\epsilon})$ complexities for the practical single loop accelerated gradient tracking over time-varying graphs when the problems are nonstrongly convex and strongly convex, respectively, where $\gamma$ and $\sigma_{\gamma}$ are two common constants charactering the network connectivity, $\epsilon$ is the desired precision, and $L$ and $\mu$ are the smoothness and strong convexity constants, respectively. Our complexities improve significantly over the ones of $O(\frac{1}{\epsilon^{5/7}})$ and $O((\frac{L}{\mu})^{5/7}\frac{1}{(1-\sigma)^{1.5}}\log\frac{1}{\epsilon})$, respectively, which were proved in the original literature only for static graphs, where $\frac{1}{1-\sigma}$ equals $\frac{\gamma}{1-\sigma_{\gamma}}$ when the network is time-invariant. When combining with a multiple consensus subroutine, the dependence on the network connectivity constants can be further improved to $O(1)$ and $O(\frac{\gamma}{1-\sigma_{\gamma}})$ for the computation and communication complexities, respectively. When the network is static, by employing the Chebyshev acceleration, our complexities exactly match the lower bounds without hiding any poly-logarithmic factor for both nonstrongly convex and strongly convex problems.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here