$\texttt{DoStoVoQ}$: Doubly Stochastic Voronoi Vector Quantization SGD for Federated Learning

NeurIPS 2021 · Louis Leconte, Aymeric Dieuleveut, Edouard Oyallon, Eric Moulines, Gilles Pages

The growing size of models and datasets has made distributed implementations of stochastic gradient descent (SGD) an active field of research. However, the high bandwidth cost of communicating gradient updates between nodes remains a bottleneck; lossy compression is one way to alleviate this problem. We propose a new $\textit{unbiased}$ Vector Quantizer (VQ), named $\texttt{StoVoQ}$, to perform gradient quantization. This approach introduces randomness into the quantization process through unitarily invariant random codebooks and a straightforward bias compensation method. The distortion of $\texttt{StoVoQ}$ significantly improves upon that of existing quantization algorithms. Next, we explain how to integrate this quantization scheme into a Federated Learning framework for complex high-dimensional models (dimension $>10^6$), introducing $\texttt{DoStoVoQ}$. We provide theoretical guarantees on the quadratic error and (absence of) bias of the compressor, which allow us to leverage strong theoretical convergence results, e.g., with heterogeneous workers or variance reduction. Finally, we show that, when training on convex and non-convex deep learning problems, our method leads to a significant reduction in bandwidth use while preserving model accuracy.
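
To make the idea of a random codebook with bias compensation concrete, here is a minimal illustrative sketch; it is not the authors' implementation. It assumes a standard Gaussian codebook (one example of a rotation-invariant law), nearest-neighbour selection in Euclidean distance, and a scalar correction `alpha` estimated by Monte Carlo. The function names, the codebook size, and the estimation procedure are all hypothetical choices made for illustration. The underlying observation is that, for a rotation-invariant codebook, the expected nearest codeword to a unit vector `u` is collinear with `u`, so dividing by a single scalar removes the bias.

```python
import math
import random


def gaussian_codebook(rng, size, dim):
    """Draw `size` i.i.d. standard Gaussian codewords (assumed codebook law)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(size)]


def nearest_codeword(x, codebook):
    """Return the codeword closest to x in squared Euclidean distance."""
    return min(codebook, key=lambda c: sum((ci - xi) ** 2 for ci, xi in zip(c, x)))


def estimate_shrinkage(rng, size, dim, trials):
    """Monte Carlo estimate of alpha = E[<c*(u), u>] for a unit vector u.

    By rotation invariance alpha does not depend on the direction of u,
    so the first basis vector is used as u.
    """
    u = [1.0] + [0.0] * (dim - 1)
    total = 0.0
    for _ in range(trials):
        c = nearest_codeword(u, gaussian_codebook(rng, size, dim))
        total += c[0]  # <c, u> with u = e_1
    return total / trials


def quantize(x, rng, size, alpha):
    """Quantize x with a fresh random codebook and compensate the bias.

    The result is unbiased in expectation, up to the Monte Carlo error in alpha.
    """
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm == 0.0:
        return list(x)
    u = [xi / norm for xi in x]
    c = nearest_codeword(u, gaussian_codebook(rng, size, len(x)))
    return [norm * ci / alpha for ci in c]
```

Averaging many calls to `quantize` on the same input should approach that input, which is the unbiasedness property the abstract emphasizes. A practical scheme would presumably transmit only the index of the chosen codeword and the norm, with both sides regenerating the codebook from a shared seed; that part is not shown here.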
