Asynchronous Stochastic Gradient Descent with Delay Compensation

ICML 2017 · Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhi-Ming Ma, Tie-Yan Liu

With the rapid development of deep learning, it has become common to train large neural networks on massive training data. Asynchronous Stochastic Gradient Descent (ASGD) is widely adopted for this task because of its efficiency, but it is known to suffer from the problem of delayed gradients: by the time a worker's gradient is applied, the global model may already have been updated by other workers...
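As the title indicates, the paper's core idea is to compensate for this delay before the stale gradient is applied. The following is a minimal NumPy sketch, not the authors' code, assuming the compensation takes the form of a first-order Taylor correction with a diagonal approximation to the Hessian (the term lam * g * g * (w_global - w_snapshot)); the function name, signature, and hyperparameter values here are illustrative assumptions.

```python
# Hedged sketch of a delay-compensated asynchronous SGD update.
# Assumed correction: g(w_t) + lam * g(w_t)^2 * (w_{t+tau} - w_t),
# i.e. a Taylor-expansion term with a diagonal Hessian approximation.
import numpy as np

def dc_asgd_update(w_global, w_snapshot, grad_delayed, lr=0.1, lam=0.04):
    """Apply a delayed gradient with a delay-compensation term.

    w_global:     current global parameters (may have moved since read)
    w_snapshot:   stale parameters the worker computed its gradient at
    grad_delayed: gradient evaluated at w_snapshot
    """
    compensated = grad_delayed + lam * grad_delayed * grad_delayed * (w_global - w_snapshot)
    return w_global - lr * compensated

# Toy usage: a worker reads a snapshot, the global model moves on,
# then the worker's (now delayed) gradient is applied with compensation.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
w_snap = w.copy()                    # worker reads parameters
w = w - 0.1 * rng.normal(size=4)     # other workers update the model meanwhile
g = 2 * w_snap                       # worker's gradient, e.g. of ||w||^2
w = dc_asgd_update(w, w_snap, g)
```

Without the correction term this reduces to plain ASGD, which simply applies the stale gradient to the moved model; the compensation is meant to bring the update closer to what sequential SGD would have done.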

