TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning

NeurIPS 2017 · Wei Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li

High network communication cost for synchronizing gradients and parameters is the well-known bottleneck of distributed training. In this work, we propose TernGrad, which uses ternary gradients to accelerate distributed deep learning in data parallelism...
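The core idea is to quantize each worker's gradient to three levels, {-1, 0, +1}, scaled by a per-tensor scalar, so that far fewer bits need to be communicated while the quantized gradient remains an unbiased estimator of the original. The sketch below illustrates this kind of stochastic ternarization; the scalar choice (s = max |g|), the Bernoulli masking, and the helper name `ternarize` are an assumption-based illustration in the spirit of the abstract, not the authors' released implementation.

```python
import numpy as np

def ternarize(grad, rng=np.random.default_rng()):
    """Stochastically quantize a gradient tensor to {-s, 0, +s}.

    Minimal sketch (assumed scheme): each element keeps its sign with
    probability |g_i| / s, where s = max |g|, and is zeroed otherwise,
    so E[ternarized gradient] equals the original gradient.
    """
    s = np.max(np.abs(grad))
    if s == 0:
        return np.zeros_like(grad)
    # Bernoulli mask: P(keep element i) = |g_i| / s
    keep = rng.random(grad.shape) < (np.abs(grad) / s)
    return s * np.sign(grad) * keep

# Example: a worker ternarizes its gradient before sending it to the server.
g = np.array([0.03, -0.5, 0.0, 0.2])
print(ternarize(g))  # every element is one of {-0.5, 0.0, +0.5}
```

Because only the sign pattern and one scalar per tensor need to be transmitted, the communicated payload shrinks to roughly 2 bits per gradient element plus a single float, which is the source of the communication savings the abstract describes.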
