Faster Distributed Synchronous SGD with Weak Synchronization

ICLR 2018 Cong XieOluwasanmi O. KoyejoIndranil Gupta

Distributed training of deep learning is widely conducted with large neural networks and large datasets. Besides asynchronous stochastic gradient descent~(SGD), synchronous SGD is a reasonable alternative with better convergence guarantees... (read more)

PDF Abstract


No code implementations yet. Submit your code now


Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper