1 code implementation • 20 Nov 2019 • Ruobing Han, James Demmel, Yang You
Our experimental results show that for many applications, APS can train state-of-the-art models with 8-bit gradients at no accuracy loss, or only a tiny one (<0.05%).
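Training with 8-bit gradients generally means quantizing each float32 gradient tensor to int8 before communication and dequantizing it afterwards. The sketch below shows a generic symmetric per-tensor scheme (scale chosen from the tensor's maximum magnitude); it is an illustration of the idea, not the APS algorithm itself, and the function names are hypothetical.

```python
def quantize_grad_int8(grad):
    # Symmetric per-tensor quantization: share one scale across the tensor
    # and map each value to an integer in [-127, 127].
    # Generic sketch; APS's actual scaling scheme may differ.
    max_abs = max(abs(g) for g in grad)
    if max_abs == 0.0:
        return [0] * len(grad), 1.0
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(g / scale))) for g in grad]
    return q, scale

def dequantize_grad(q, scale):
    # Recover approximate float gradients from the int8 codes.
    return [v * scale for v in q]

grad = [0.001, -0.0005, 0.0002, 0.0]
q, scale = quantize_grad_int8(grad)
recovered = dequantize_grad(q, scale)
```

The rounding error per element is bounded by half the scale, which is why accuracy loss can stay small when gradient magnitudes within a tensor are well matched to a single scale.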
1 code implementation • 19 Feb 2019 • Peng Sun, Wansen Feng, Ruobing Han, Shengen Yan, Yonggang Wen
To address this problem, we propose a communication backend named GradientFlow for distributed DNN training, and employ a set of network optimization techniques.
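One network optimization commonly employed by such communication backends is gradient fusion: many small gradient tensors are packed into a single buffer so that one all-reduce call replaces many small ones, amortizing per-message overhead. The sketch below illustrates that idea with plain Python lists; it is a generic illustration, not GradientFlow's implementation, and all names are hypothetical.

```python
def fuse_gradients(tensors):
    # Flatten a list of gradient tensors into one contiguous buffer,
    # recording each tensor's length so the buffer can be split back.
    sizes = [len(t) for t in tensors]
    flat = [x for t in tensors for x in t]
    return flat, sizes

def unfuse_gradients(flat, sizes):
    # Inverse of fuse_gradients: slice the buffer back into tensors.
    out, i = [], 0
    for n in sizes:
        out.append(flat[i:i + n])
        i += n
    return out

def allreduce_sum(buffers):
    # Stand-in for a collective all-reduce: elementwise sum across workers.
    return [sum(vals) for vals in zip(*buffers)]

# Two workers each fuse their gradients, reduce once, then unfuse.
w0 = [[1.0, 2.0], [3.0]]
w1 = [[0.5, 0.5], [1.0]]
f0, sizes = fuse_gradients(w0)
f1, _ = fuse_gradients(w1)
reduced = unfuse_gradients(allreduce_sum([f0, f1]), sizes)
```

The design point is that the number of communication calls becomes independent of the number of layers, which matters most when a model has many small parameter tensors.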
Distributed, Parallel, and Cluster Computing