Deep Residual Learning for Image Recognition

CVPR 2016 · Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
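
The reformulation described in the abstract, where stacked layers learn a residual function F(x) that is added back to the layer input x, can be illustrated with a small sketch. This is a minimal NumPy illustration under simplifying assumptions, not the authors' implementation: it uses two fully connected layers in place of convolutions, omits batch normalization, and the names `relu`, `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np


def relu(x):
    """Element-wise rectified linear unit."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F(x) = w2 @ relu(w1 @ x).

    Assumption: fully connected layers stand in for the convolutional
    layers of a real residual network, and normalization is omitted.
    """
    residual = w2 @ relu(w1 @ x)  # F(x): the residual the stacked layers learn
    return relu(residual + x)     # identity shortcut adds the input back


rng = np.random.default_rng(0)
x = rng.standard_normal(4)

# With all-zero weights F(x) = 0, so the block reduces to relu(x):
# the stacked layers only need to learn a correction to the identity.
zero = np.zeros((4, 4))
y_identity = residual_block(x, zero, zero)

# With small random weights the output is a perturbation of relu(x).
w1 = 0.1 * rng.standard_normal((4, 4))
w2 = 0.1 * rng.standard_normal((4, 4))
y = residual_block(x, w1, w2)
```

Because the shortcut carries the input forward unchanged, driving the weights toward zero leaves the input intact up to the final ReLU. This gives one intuition for why additional residual layers need not make a deep network harder to optimize.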


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Image Classification | cifar100 | shreynet | 1:1 Accuracy | 45.98 | #1 |
| Semantic Segmentation | Cityscapes val | Dilated-ResNet (Dilated-ResNet-101) | mIoU | 75.7 | #59 |
| Object Detection | COCO test-dev | Faster R-CNN (box refinement, context, multi-scale testing) | box mAP | 34.9 | #219 |
| Semantic Segmentation | DADA-seg | ResNet-101 | mIoU | 23.60 | #16 |
| Semantic Segmentation | DADA-seg | ResNet-50 | mIoU | 18.96 | #24 |
| Image Classification | GasHisSDB | ResNet-18 | Accuracy | 98.47 | #5 |
| | | | Precision | 99.94 | #3 |
| | | | F1-Score | 99.19 | #5 |
| Image Classification | GasHisSDB | ResNet-50 | Accuracy | 98.56 | #4 |
| | | | Precision | 99.94 | #3 |
| | | | F1-Score | 99.24 | #4 |
| Image-to-Image Translation | GTAV-to-Cityscapes Labels | ResNet101 65.1 | mIoU | 41.7 | #20 |
| Image Classification | ImageNet | ResNet-101 | Top 1 Accuracy | 78.25% | #770 |
| | | | Number of params | 40M | #672 |
| | | | GFLOPs | 7.6 | #258 |
| Image Classification | ImageNet | ResNet-152 | Top 1 Accuracy | 78.57% | #752 |
| | | | GFLOPs | 11.3 | #309 |
| Image Classification | ImageNet | ResNet-50 | Top 1 Accuracy | 75.3% | #873 |
| | | | Number of params | 25M | #581 |
| | | | GFLOPs | 3.8 | #186 |
| Domain Generalization | ImageNet-A | ResNet-50 (300 Epochs) | Top-1 accuracy % | 4.2 | #37 |
| Domain Generalization | ImageNet-R | ResNet-50 | Top-1 Error Rate | 63.9 | #36 |
| Out-of-Distribution Generalization | ImageNet-W | ResNet-50 | IN-W Gap | -26.7 | #1 |
| | | | Carton Gap | +40 | #1 |
| Dynamic Facial Expression Recognition | MAFW | ResNet-18+LSTM | WAR | 39.38 | #13 |
| | | | UAR | 28.08 | #12 |
| Dynamic Facial Expression Recognition | MAFW | ResNet-18 | WAR | 36.65 | #14 |
| | | | UAR | 25.58 | #13 |
| Medical Image Classification | NCT-CRC-HE-100K | ResNet-50 | Accuracy (%) | 94.72 | #4 |
| | | | F1-Score | 97.09 | #4 |
| | | | Precision | 100.00 | #1 |
| | | | Specificity | 99.34 | #4 |
| Medical Image Classification | NCT-CRC-HE-100K | ResNet-18 | Accuracy (%) | 92.66 | #7 |
| | | | F1-Score | 95.23 | #7 |
| | | | Precision | 99.90 | #5 |
| | | | Specificity | 99.08 | #7 |
| Retinal OCT Disease Classification | OCT2017 | ResNet50-v1 | Acc | 99.3 | #5 |
| | | | Sensitivity | 99.3 | #6 |
| Domain Adaptation | Office-31 | ResNet-50 | Average Accuracy | 76.1 | #34 |
| Unsupervised Domain Adaptation | Office-Home | ResNet-50 [cite:CVPR16DRL] | Accuracy | 59.9 | #8 |
| Image Classification | OmniBenchmark | ResNet-50 | Average Top-1 Accuracy | 34.3 | #15 |
| Image Classification | OmniBenchmark | ResNet-101 | Average Top-1 Accuracy | 37.4 | #10 |
| Person Re-Identification | SYSU-30k | ResNet-50 (generalization) | Rank-1 | 20.1 | #1 |
| Pedestrian Attribute Recognition | UAV-Human | ResNet | Gender | 74.7 | #2 |
| | | | Hat | 65.2 | #2 |
| | | | UCC | 44.4 | #2 |
| | | | UCS | 68.9 | #2 |
| | | | LCC | 49.7 | #2 |
| | | | LCS | 69.3 | #1 |
| | | | Backpack | 63.5 | #2 |
| Domain Generalization | VizWiz-Classification | ResNet-152 | Accuracy - All Images | 47.5 | #11 |
| | | | Accuracy - Corrupted Images | 43.3 | #8 |
| | | | Accuracy - Clean Images | 51.3 | #9 |
| Domain Generalization | VizWiz-Classification | ResNet-50 | Accuracy - All Images | 42.9 | #20 |
| | | | Accuracy - Corrupted Images | 37.1 | #19 |
| | | | Accuracy - Clean Images | 47.7 | #19 |
| Domain Generalization | VizWiz-Classification | ResNet-101 | Accuracy - All Images | 46.3 | #12 |
| | | | Accuracy - Corrupted Images | 40.5 | #10 |
| | | | Accuracy - Clean Images | 50.1 | #12 |
| Multi-Label Image Classification | VizWiz-Classification | ResNet151 | Accuracy | 47.5 | #1 |
| Classification | XImageNet-12 | ResNet 50 | Robustness Score | 0.8985 | #2 |

Results from Other Papers


| Task | Dataset | Model | Metric Name | Metric Value | Rank |
|---|---|---|---|---|---|
| Object Detection | COCO minival | GFL (ResNet-50) | box AP | 44.5 | #117 |
| | | | AP50 | 63.0 | #54 |
| | | | AP75 | 48.3 | #38 |
| Object Detection | COCO minival | Cascade Mask R-CNN (ResNet-50) | box AP | 46.3 | #100 |
| | | | AP50 | 64.3 | #40 |
| | | | AP75 | 50.5 | #29 |
| Object Detection | COCO minival | ATSS (ResNet-50) | box AP | 43.5 | #125 |
| | | | AP50 | 61.9 | #62 |
| | | | AP75 | 47.0 | #47 |
| Breast Tumour Classification | PCam | ResNet-34 (e) | AUC | 0.942 | #11 |
| Breast Tumour Classification | PCam | ResNet-50 (e) | AUC | 0.948 | #10 |
| Retinal OCT Disease Classification | Srinivasan2014 | ResNet50-v1 | Acc | 94.92 | #10 |
| Synthetic-to-Real Translation | Syn2Real-C | No Adaptation | Accuracy | 52.4 | #6 |
| Crowd Counting | UCF-QNRF | Resnet101 | MAE | 190 | #9 |
| Semantic Object Interaction Classification | VLOG | R50 | MAP | 40.5 | #2 |

Methods