Search Results for author: Hyungjun Oh

Found 1 paper, 0 papers with code

Scheduling Optimization Techniques for Neural Network Training

no code implementations · 3 Oct 2021 · Hyungjun Oh, Hyeongju Kim, Jiwon Seo

In data-parallel training, we reorder the gradient computations to maximize the overlap of computation and parameter communication; in pipeline-parallel training, we prioritize critical gradient computations to reduce pipeline stalls. We evaluate our optimizations with twelve neural networks, including a lightweight computer vision model (MobileNet) and large NLP models (BERT and GPT-3), with up to 48 V100 GPUs. Our scheduling algorithms effectively improve the performance of single-GPU training as well as data- and pipeline-parallel training. Compared to the respective state-of-the-art training systems, the throughput is substantially improved for single-GPU, data-parallel, and pipeline-parallel training.
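The data-parallel part of the abstract rests on overlapping gradient computation with parameter communication. Below is a minimal, hypothetical PyTorch sketch of that general overlap pattern, not the paper's actual scheduler: each parameter launches an asynchronous all-reduce as soon as its gradient is ready, so communication proceeds while the backward pass continues through earlier layers. The model, shapes, and hook-based bucketing are illustrative assumptions.

```python
# Sketch of compute/communication overlap in data-parallel training.
# NOT the paper's system; a generic pattern assuming a PyTorch setup.
import torch
import torch.distributed as dist
from torch import nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

pending = []  # handles of in-flight asynchronous all-reduce calls

def make_hook(param):
    # Launch an async all-reduce as soon as this parameter's gradient is
    # computed, so communication overlaps with the rest of the backward pass.
    def hook(grad):
        if dist.is_available() and dist.is_initialized():
            handle = dist.all_reduce(grad, op=dist.ReduceOp.SUM, async_op=True)
            pending.append((handle, grad))
        return grad
    return hook

for p in model.parameters():
    p.register_hook(make_hook(p))

x = torch.randn(32, 128)
loss = model(x).sum()
loss.backward()  # hooks fire layer by layer, from the last layer backward

# Wait for outstanding communication, then average the summed gradients.
for handle, grad in pending:
    handle.wait()
    grad /= dist.get_world_size()
```

When run without a process group the communication is skipped and only the local backward pass executes; with one, each gradient's all-reduce is in flight while earlier layers are still computing their gradients. The paper's contribution, per the abstract, is deciding the order in which gradients are computed so that this overlap is maximized.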

Scheduling
