GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

16 Nov 2018 · Yanping Huang, Youlong Cheng, Ankur Bapna, Orhan Firat, Mia Xu Chen, Dehao Chen, HyoukJoong Lee, Jiquan Ngiam, Quoc V. Le, Yonghui Wu, Zhifeng Chen

Scaling up deep neural network capacity is known to be an effective approach to improving model quality across several machine learning tasks. In many cases, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure...
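GPipe's core idea, as the title states, is pipeline parallelism: a network is partitioned into sequential stages placed on different accelerators, and each mini-batch is split into micro-batches that flow through the stages in a pipeline. The following is a minimal pure-Python sketch of that data flow, not the GPipe API; all names here (`split_into_microbatches`, `pipeline_forward`) are illustrative, and the sequential loop only simulates what concurrent accelerators would overlap in practice.

```python
# Illustrative sketch of GPipe-style micro-batch pipelining.
# Names and structure are assumptions for exposition, not GPipe's real API.

def split_into_microbatches(batch, n):
    """Split a mini-batch into n roughly equal micro-batches."""
    size = (len(batch) + n - 1) // n
    return [batch[i:i + size] for i in range(0, len(batch), size)]

def pipeline_forward(stages, batch, n_micro):
    """Run every micro-batch through each stage in order.

    On real hardware each stage lives on its own accelerator, so stage k
    can process micro-batch i+1 while stage k+1 processes micro-batch i;
    this sequential loop only simulates the resulting data flow.
    """
    outputs = []
    for micro in split_into_microbatches(batch, n_micro):
        x = micro
        for stage in stages:
            x = [stage(v) for v in x]
        outputs.extend(x)
    return outputs

# Toy example: a 4-stage "network" where each stage adds 1,
# so every input x comes out as x + 4.
stages = [lambda v: v + 1] * 4
result = pipeline_forward(stages, list(range(8)), n_micro=4)
# result == [4, 5, 6, 7, 8, 9, 10, 11]
```

Splitting into more micro-batches shrinks the idle "bubble" at the start and end of the pipeline, which is why GPipe's efficiency improves with the micro-batch count.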


Evaluation results from the paper


 SOTA for Image Classification on CIFAR-10 (using extra training data)

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Fine-Grained Image Classification | Birdsnap | GPIPE | Accuracy | 83.6% | # 2 |
| Image Classification | CIFAR-10 | GPIPE + transfer learning | Percentage correct | 99 | # 1 |
| Image Classification | CIFAR-10 | GPIPE + transfer learning | Percentage error | 1 | # 1 |
| Image Classification | CIFAR-100 | GPIPE | Percentage correct | 91.3 | # 2 |
| Image Classification | ImageNet | GPIPE | Top 1 Accuracy | 84.3% | # 5 |
| Image Classification | ImageNet | GPIPE | Top 5 Accuracy | 97% | # 6 |
| Image Classification | ImageNet | GPIPE | Number of params | 557M | # 1 |