Search Results for author: Yifei Cheng

Found 5 papers, 3 papers with code

DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training

1 code implementation • 28 Feb 2022 • Joya Chen, Kai Xu, Yifei Cheng, Angela Yao

The bulk of training memory is occupied by the intermediate tensors cached for gradient computation in the backward pass.
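
The general idea is that activations cached for the backward pass can be sparsified before caching. Below is a minimal PyTorch sketch of that idea for a single linear layer; the top-k (by magnitude) selection and the keep ratio are illustrative assumptions, not the authors' exact DropIT procedure.

```python
# Minimal sketch: cache a sparsified activation for the backward pass.
# The top-k selection and keep_ratio are illustrative assumptions only.
import torch

class SparseCachedLinear(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, weight, keep_ratio=0.25):
        y = x @ weight.t()
        flat = x.reshape(-1)
        k = max(1, int(keep_ratio * flat.numel()))
        _, idx = flat.abs().topk(k)              # keep only the largest elements
        ctx.save_for_backward(flat[idx], idx, weight)
        ctx.x_shape = x.shape
        return y

    @staticmethod
    def backward(ctx, grad_y):
        vals, idx, weight = ctx.saved_tensors
        # Rebuild an approximate (sparse) copy of x only when it is needed.
        x_approx = torch.zeros(ctx.x_shape, dtype=vals.dtype, device=vals.device)
        x_approx.reshape(-1)[idx] = vals
        grad_x = grad_y @ weight                 # exact, does not need x
        grad_w = grad_y.t() @ x_approx           # uses the dropped (approximate) cache
        return grad_x, grad_w, None

# Usage: gradients flow to both x and w despite the reduced cache.
x = torch.randn(8, 16, requires_grad=True)
w = torch.randn(4, 16, requires_grad=True)
SparseCachedLinear.apply(x, w, 0.25).sum().backward()
```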

STL-SGD: Speeding Up Local SGD with Stagewise Communication Period

no code implementations • 11 Jun 2020 • Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu

Distributed parallel stochastic gradient descent algorithms are workhorses for large-scale machine learning tasks.
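
As a rough illustration of a stagewise communication period, the sketch below simulates local SGD on a toy least-squares problem and doubles the number of local steps between model averages at each stage. The objective, the doubling schedule, and all hyperparameters are assumptions for illustration, not the paper's STL-SGD algorithm.

```python
# Toy simulation of local SGD with a stagewise-growing communication period.
# Problem, schedule, and hyperparameters are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n_workers, dim, n_samples = 4, 10, 64
A = [rng.standard_normal((n_samples, dim)) for _ in range(n_workers)]
x_true = rng.standard_normal(dim)
b = [a @ x_true + 0.1 * rng.standard_normal(n_samples) for a in A]

models = [np.zeros(dim) for _ in range(n_workers)]
lr, period = 0.01, 1

for stage in range(5):
    for it in range(100):
        for w in range(n_workers):
            i = rng.integers(n_samples)
            grad = (A[w][i] @ models[w] - b[w][i]) * A[w][i]   # stochastic gradient
            models[w] = models[w] - lr * grad
        if (it + 1) % period == 0:                             # synchronize: average models
            avg = np.mean(models, axis=0)
            models = [avg.copy() for _ in range(n_workers)]
    print(f"stage {stage}: period={period}, "
          f"err={np.linalg.norm(np.mean(models, axis=0) - x_true):.3f}")
    period *= 2   # stagewise: communicate less often in later stages
```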

Variance Reduced Local SGD with Lower Communication Complexity

1 code implementation • 30 Dec 2019 • Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng

To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, employing multiple workers in parallel.

Is Heuristic Sampling Necessary in Training Deep Object Detectors?

13 code implementations • 11 Sep 2019 • Joya Chen, Dong Liu, Tong Xu, Shiwei Wu, Yifei Cheng, Enhong Chen

In this paper, we challenge the necessity of heuristic hard/soft sampling methods for training accurate deep object detectors.

Tasks: General Classification, Instance Segmentation, +1

Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent

no code implementations • 28 Jun 2019 • Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng

Although distributed stochastic gradient descent (SGD) algorithms can achieve a linear iteration speedup, in practice they are significantly limited by communication cost, making a linear time speedup difficult to achieve.
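
A back-of-the-envelope model makes the gap between iteration speedup and time speedup concrete: if gradient synchronization cannot be hidden behind computation, it is paid on every iteration. The timings below are invented purely for illustration; this sketch shows the intuition for overlapping communication with computation, not the CoCoD-SGD algorithm itself.

```python
# Illustrative timing model only; the per-iteration costs are made-up assumptions.
compute_time = 50e-3   # seconds of forward/backward computation per iteration
comm_time = 30e-3      # seconds of gradient synchronization per iteration

# Synchronous SGD pays computation plus communication on every iteration.
sync_iter = compute_time + comm_time

# If communication is decoupled and overlapped with computation,
# an iteration costs roughly the slower of the two phases.
overlapped_iter = max(compute_time, comm_time)

print(f"synchronous:  {sync_iter * 1e3:.0f} ms/iter")
print(f"overlapped:   {overlapped_iter * 1e3:.0f} ms/iter")
print(f"time saved per iteration: {1 - overlapped_iter / sync_iter:.0%}")
```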
