$\texttt{DoStoVoQ}$: Doubly Stochastic Voronoi Vector Quantization SGD for Federated Learning

The growing size of models and datasets have made distributed implementation of stochastic gradient descent (SGD) an active field of research. However the high bandwidth cost of communicating gradient updates between nodes remains a bottleneck; lossy compression is a way to alleviate this problem. We propose a new $\textit{unbiased}$ Vector Quantizer (VQ), named $\texttt{StoVoQ}$, to perform gradient quantization. This approach relies on introducing randomness within the quantization process, that is based on the use of unitarily invariant random codebooks and on a straightforward bias compensation method. The distortion of $\texttt{StoVoQ}$ significantly improves upon existing quantization algorithms. Next, we explain how to combine this quantization scheme within a Federated Learning framework for complex high-dimensional model (dimension $>10^6$), introducing $\texttt{DoStoVoQ}$. We provide theoretical guarantees on the quadratic error and (absence of) bias of the compressor, that allow to leverage strong theoretical results of convergence, e.g., with heterogeneous workers or variance reduction. Finally, we show that training on convex and non-convex deep learning problems, our method leads to significant reduction of bandwidth use while preserving model accuracy.

PDF Abstract


Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here