Double Quantization for Communication-Efficient Distributed Optimization

NeurIPS 2019 · Yue Yu, Jiaxiang Wu, Longbo Huang ·

Modern distributed training of machine learning models suffers from high communication overhead for synchronizing stochastic gradients and model parameters. In this paper, to reduce the communication complexity, we propose \emph{double quantization}, a general scheme for quantizing both model parameters and gradients. Three communication-efficient algorithms are proposed under this general scheme. Specifically, (i) we propose a low-precision algorithm AsyLPG with asynchronous parallelism, (ii) we explore integrating gradient sparsification with double quantization and develop Sparse-AsyLPG, (iii) we show that double quantization can also be accelerated by momentum technique and design accelerated AsyLPG. We establish rigorous performance guarantees for the algorithms, and conduct experiments on a multi-server test-bed to demonstrate that our algorithms can effectively save transmitted bits without performance degradation.