A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication

NeurIPS 2018 Peng JiangGagan Agrawal

The large communication overhead has imposed a bottleneck on the performance of distributed Stochastic Gradient Descent (SGD) for training deep neural networks. Previous works have demonstrated the potential of using gradient sparsification and quantization to reduce the communication cost... (read more)

PDF Abstract


No code implementations yet. Submit your code now

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper