GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism

Scaling up deep neural network capacity is known to be an effective approach to improving model quality across several machine learning tasks. In many cases, however, increasing model capacity beyond the memory limit of a single accelerator has required developing special algorithms or infrastructure...
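As a rough illustration of the micro-batch pipelining the abstract alludes to, here is a minimal, forward-only sketch in plain Python/NumPy. The stage functions and the values of `K` (partitions) and `M` (micro-batches) are illustrative stand-ins, not the paper's code; the actual GPipe library also pipelines the backward pass, re-materializes activations to save memory, and places each partition on its own accelerator.

```python
import numpy as np

def make_stage(seed):
    """Stand-in for one partition of a sequential network: linear layer + ReLU."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((8, 8)) * 0.1
    return lambda x: np.maximum(x @ w, 0.0)

K = 4                                   # number of pipeline stages (accelerators)
M = 8                                   # micro-batches per mini-batch
stages = [make_stage(k) for k in range(K)]

x = np.random.default_rng(0).standard_normal((32, 8))   # one mini-batch
micro_batches = list(np.split(x, M))                     # GPipe-style micro-batch split

# activations[k][m] holds micro-batch m after stage k-1; activations[0] is the input.
activations = [micro_batches] + [[None] * M for _ in range(K)]

# Pipeline schedule: at clock step t, stage k processes micro-batch m = t - k.
# In a real system the K stages run in parallel at each clock step.
for t in range(M + K - 1):
    for k in range(K):
        m = t - k
        if 0 <= m < M:
            activations[k + 1][m] = stages[k](activations[k][m])

output = np.concatenate(activations[K])  # re-assembled mini-batch activations
assert output.shape == x.shape
```

With M micro-batches over K partitions, the idle "bubble" fraction of this schedule is (K - 1)/(M + K - 1), which is why GPipe's efficiency improves as the number of micro-batches grows.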

Published at NeurIPS 2019.

Datasets

Birdsnap, CIFAR-10, CIFAR-100, ImageNet, Stanford Cars

Results from the Paper


Ranked #3 on Fine-Grained Image Classification on Birdsnap (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank | Uses Extra Training Data |
|---|---|---|---|---|---|---|
| Fine-Grained Image Classification | Birdsnap | GPipe | Accuracy | 83.6% | #3 | Yes |
| Image Classification | CIFAR-10 | GPipe + transfer learning | Percentage correct | 99 | #7 | |
| Image Classification | CIFAR-10 | GPipe + transfer learning | Percentage error | 1 | #6 | |
| Image Classification | CIFAR-100 | GPipe | Percentage correct | 91.3 | #6 | |
| Image Classification | ImageNet | GPipe | Top 1 Accuracy | 84.4% | #32 | |
| Image Classification | ImageNet | GPipe | Top 5 Accuracy | 97% | #18 | |
| Fine-Grained Image Classification | Stanford Cars | GPipe | Accuracy | 94.6% | #12 | |

Methods used in the Paper


| Method | Type |
|---|---|
| Spatially Separable Convolution | Convolutions |
| Max Pooling | Pooling Operations |
| Convolution | Convolutions |
| Average Pooling | Pooling Operations |
| AmoebaNet | Convolutional Neural Networks |
| Residual Connection | Skip Connections |
| BPE | Subword Segmentation |
| Dense Connections | Feedforward Networks |
| Label Smoothing | Regularization |
| ReLU | Activation Functions |
| Adam | Stochastic Optimization |
| Softmax | Output Functions |
| Dropout | Regularization |
| Multi-Head Attention | Attention Modules |
| Layer Normalization | Normalization |
| Scaled Dot-Product Attention | Attention Mechanisms |
| Transformer | Transformers |