RepVGG: Making VGG-style ConvNets Great Again

We present a simple but powerful convolutional neural network architecture whose VGG-like inference-time body consists of nothing but a stack of 3x3 convolutions and ReLUs, while the training-time model has a multi-branch topology. This decoupling of the training-time and inference-time architectures is realized by a structural re-parameterization technique, hence the name RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is, to the best of our knowledge, a first for a plain model. On an NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101 with higher accuracy, and show a favorable accuracy-speed trade-off compared to state-of-the-art models such as EfficientNet and RegNet. The code and trained models are available at https://github.com/megvii-model/RepVGG.
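
The re-parameterization idea in the abstract can be illustrated with a small sketch. It assumes a training-time block made of three parallel branches (a 3x3 convolution, a 1x1 convolution, and an identity shortcut) whose outputs are summed, and it shows that the sum equals a single 3x3 convolution with a merged kernel. Batch normalization and biases are deliberately left out to keep the example short, and all function names here (`conv2d`, `fuse_branches`, and so on) are hypothetical helpers, not taken from the official repository.

```python
import random

def conv2d(x, w, pad):
    """Naive stride-1 convolution (cross-correlation) with zero padding.
    x: [C_in][H][W], w: [C_out][C_in][k][k]; returns [C_out][H'][W']."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    out = []
    for o in range(len(w)):
        plane = []
        for i in range(h + 2 * pad - k + 1):
            row = []
            for j in range(wd + 2 * pad - k + 1):
                s = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            y, xx = i + di - pad, j + dj - pad
                            if 0 <= y < h and 0 <= xx < wd:  # zero padding
                                s += w[o][c][di][dj] * x[c][y][xx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

def add(*tensors):
    """Elementwise sum of equally shaped [C][H][W] tensors."""
    return [[[sum(vals) for vals in zip(*rows)]
             for rows in zip(*planes)]
            for planes in zip(*tensors)]

def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two [C][H][W] tensors."""
    return max(abs(p - q)
               for pa, pb in zip(a, b)
               for ra, rb in zip(pa, pb)
               for p, q in zip(ra, rb))

def fuse_branches(w3, w1):
    """Merge 3x3, 1x1 and identity branches into one 3x3 kernel.
    Assumes C_in == C_out, which the identity branch requires."""
    c = len(w3)
    fused = [[[row[:] for row in w3[o][i]] for i in range(c)] for o in range(c)]
    for o in range(c):
        for i in range(c):
            fused[o][i][1][1] += w1[o][i][0][0]  # a 1x1 kernel is a 3x3 kernel's center tap
        fused[o][o][1][1] += 1.0                 # identity: center tap of 1 on the same channel
    return fused

def rand_kernel(c_out, c_in, k):
    return [[[[random.uniform(-1, 1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]

random.seed(0)
C, H, W = 2, 4, 4
x = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w3, w1 = rand_kernel(C, C, 3), rand_kernel(C, C, 1)

y_multi = add(conv2d(x, w3, 1), conv2d(x, w1, 0), x)  # training-time multi-branch block
fused = fuse_branches(w3, w1)
y_fused = conv2d(x, fused, 1)                         # inference-time plain 3x3 conv

print("outputs match:", max_abs_diff(y_multi, y_fused) < 1e-9)
```

The merge works because convolution is linear in its kernel: summing branch outputs is the same as summing their kernels once each has been expressed as a 3x3 kernel. In the published method each branch also carries batch normalization, which is folded into the convolution weights and a bias before the branches are merged; that step is omitted in this sketch.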

CVPR 2021
Task                   Dataset         Model        Metric Name       Metric Value  Global Rank
Semantic Segmentation  Cityscapes val  RepVGG-B2    mIoU              80.57%        # 42
Image Classification   ImageNet        RepVGG-B2    Top 1 Accuracy    78.78%        # 743
Image Classification   ImageNet        RepVGG-B2    Number of params  80.31M        # 805
Image Classification   ImageNet        RepVGG-B2    GFLOPs            18.4          # 358
Image Classification   ImageNet        RepVGG-B2g4  Top 1 Accuracy    78.5%         # 760
Image Classification   ImageNet        RepVGG-B2g4  Number of params  55.77M        # 745
Image Classification   ImageNet        RepVGG-B2g4  GFLOPs            11.3          # 309

Methods