Heterogeneity-Aware Asynchronous Decentralized Training

no code implementations • 17 Sep 2019 • Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian

Is it possible to get the best of both worlds: a distributed training method that has both the high performance of All-Reduce in homogeneous environments and the good heterogeneity tolerance of AD-PSGD?


Hop: Heterogeneity-Aware Decentralized Training

no code implementations • 4 Feb 2019 • Qinyi Luo, JinKun Lin, Youwei Zhuo, Xuehai Qian

Based on a unique characteristic of decentralized training that we have identified, the iteration gap, we propose a queue-based synchronization mechanism that can efficiently implement backup workers and bounded staleness in the decentralized setting.

HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

no code implementations • 7 Jan 2019 • Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen

In this paper, inspired by recent work in machine learning systems, we propose a solution, HyPar, to determine layer-wise parallelism for deep neural network training with an array of DNN accelerators.

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

no code implementations • 12 Dec 2018 • Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang

It is a challenging task to have real-time, efficient, and accurate hardware RNN implementations because of the high sensitivity to imprecision accumulation and the requirement of special activation function implementations.

GraphR: Accelerating Graph Processing Using ReRAM

no code implementations • 21 Aug 2017 • Linghao Song, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen

GRAPHR gains a speedup of 1.16x to 4.12x, and is 3.67x to 10.96x more energy efficient compared to a PIM-based architecture.

