no code implementations • 17 Sep 2019 • Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian
Is it possible to get the best of both worlds - designing a distributed training method that achieves both the high performance of All-Reduce in homogeneous environments and the heterogeneity tolerance of AD-PSGD?
no code implementations • 4 Feb 2019 • Qinyi Luo, JinKun Lin, Youwei Zhuo, Xuehai Qian
Based on a unique characteristic of decentralized training that we have identified, the iteration gap, we propose a queue-based synchronization mechanism that can efficiently implement backup workers and bounded staleness in the decentralized setting.
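The listing gives only the abstract, so the mechanism itself is not shown here; below is a minimal Python sketch of one way a queue-based coordinator could combine bounded staleness with backup workers in decentralized training. The class, method, and parameter names (BoundedStalenessQueue, max_staleness, num_backup) are assumptions for illustration, not the paper's actual design or API.

```python
import threading
from collections import deque


class BoundedStalenessQueue:
    """Toy coordinator (assumed design, not the paper's): each worker pushes its
    finished iteration into a per-worker queue; a worker may start iteration t only
    when the slowest counted worker has reached at least t - max_staleness (bounded
    staleness), optionally ignoring the num_backup slowest workers (backup workers)."""

    def __init__(self, num_workers, max_staleness=2, num_backup=0):
        self.num_workers = num_workers
        self.max_staleness = max_staleness
        self.num_backup = num_backup
        self.progress = [0] * num_workers                     # latest finished iteration per worker
        self.queues = [deque() for _ in range(num_workers)]   # per-worker update queues
        self.cond = threading.Condition()

    def push_update(self, worker_id, iteration, update):
        """Worker posts its update for `iteration` and records its progress."""
        with self.cond:
            self.queues[worker_id].append((iteration, update))
            self.progress[worker_id] = iteration
            self.cond.notify_all()

    def wait_to_start(self, worker_id, iteration):
        """Block until the iteration gap to the slowest counted worker is within bounds."""
        with self.cond:
            while self._slowest_counted() < iteration - self.max_staleness:
                self.cond.wait()

    def _slowest_counted(self):
        # Ignore the num_backup slowest workers, in the spirit of backup workers.
        ranked = sorted(self.progress)
        idx = min(self.num_backup, self.num_workers - 1)
        return ranked[idx]


def worker(coord, worker_id, num_iters):
    for t in range(1, num_iters + 1):
        coord.wait_to_start(worker_id, t)
        update = f"grad-{worker_id}-{t}"   # placeholder for a real gradient/model delta
        coord.push_update(worker_id, t, update)


if __name__ == "__main__":
    coord = BoundedStalenessQueue(num_workers=4, max_staleness=2, num_backup=1)
    threads = [threading.Thread(target=worker, args=(coord, i, 10)) for i in range(4)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    print("final progress:", coord.progress)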