no code implementations • 29 Sep 2021 • Pengcheng Li, Yixin Guo, Yawen Zhang, Qinggang Zhou
Mini-batch Stochastic Gradient Descent (SGD) requires workers to halt forward/backward propagation and wait for gradients to be synchronized among all workers before starting the next batch.
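The synchronization barrier described above can be sketched as follows. This is a minimal single-process illustration (not the paper's system): each simulated worker computes a local gradient on its data shard, and the update proceeds only after all gradients are averaged, mimicking the all-reduce step that forces workers to wait. The loss, shard layout, and function names are illustrative assumptions.

```python
# Minimal sketch of synchronous mini-batch SGD (illustrative, not the
# paper's implementation): workers compute local gradients, then block
# on a synchronization step that averages all gradients before the
# next update is applied.

def local_gradient(w, x, y):
    # Gradient of the squared error 0.5 * (w*x - y)^2 with respect to w.
    return (w * x - y) * x

def synchronous_sgd_step(w, shards, lr=0.1):
    # Each worker computes a gradient on its own shard...
    grads = [local_gradient(w, x, y) for (x, y) in shards]
    # ...then all workers wait until every gradient is available and
    # averaged (the all-reduce barrier) before the parameter update.
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

# Three "workers", each holding one (x, y) pair consistent with w = 2.
shards = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0
for _ in range(200):
    w = synchronous_sgd_step(w, shards)
print(round(w, 3))  # converges toward 2.0
```

The barrier is the cost this line of work targets: no worker can begin its next batch until the slowest worker's gradient has arrived.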
1 code implementation • 30 Mar 2021 • Chengxi Ye, Xiong Zhou, Tristan McKinney, Yanfeng Liu, Qinggang Zhou, Fedor Zhdanov
Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks.
no code implementations • 31 May 2020 • Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang
State-of-the-art deep learning algorithms rely on distributed training systems to handle the increasing sizes of models and training data sets.