ResNeSt: Split-Attention Networks

It is well known that feature-map attention and multi-path representation are important for visual recognition. In this paper, we present a modularized architecture that applies channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations... Our design results in a simple and unified computation block, which can be parameterized using only a few variables. Our model, named ResNeSt, outperforms EfficientNet in the accuracy and latency trade-off on image classification. In addition, ResNeSt has achieved superior transfer learning results on several public benchmarks when serving as the backbone, and has been adopted by the winning entries of the COCO-LVIS challenge. The source code for the complete system and the pretrained models are publicly available.
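The idea of applying channel-wise attention across network branches can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `split_attention`, the two dense layers `w1` and `w2`, and all sizes are assumptions made for illustration, and details such as batch normalization and grouped convolutions are left out. The sketch only shows the general pattern: pool the summed branches into a per-channel descriptor, predict one weight per branch and channel, normalize those weights across branches with a softmax, and take the weighted sum.

```python
import numpy as np

def split_attention(branches, w1, w2):
    """Fuse R branch feature maps with channel-wise attention (illustrative only).

    branches: array of shape (R, C, H, W), one feature map per branch.
    w1: (C, D) weights of a hypothetical bottleneck layer.
    w2: (D, R * C) weights of a hypothetical layer producing per-branch logits.
    Returns the fused map (C, H, W) and the attention weights (R, C).
    """
    R, C, H, W = branches.shape
    # Sum the branches, then global-average-pool to one value per channel.
    gap = branches.sum(axis=0).mean(axis=(1, 2))      # shape (C,)
    hidden = np.maximum(gap @ w1, 0.0)                # ReLU, shape (D,)
    logits = (hidden @ w2).reshape(R, C)              # shape (R, C)
    # Softmax across branches, computed independently for each channel.
    logits = logits - logits.max(axis=0, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=0, keepdims=True)
    # Weight each branch per channel and sum over branches.
    fused = (attn[:, :, None, None] * branches).sum(axis=0)
    return fused, attn

# Toy usage with random inputs; all sizes are arbitrary.
rng = np.random.default_rng(0)
R, C, H, W, D = 2, 8, 4, 4, 4
branches = rng.standard_normal((R, C, H, W))
w1 = rng.standard_normal((C, D))
w2 = rng.standard_normal((D, R * C))
fused, attn = split_attention(branches, w1, w2)
```

For every channel the attention weights sum to one across branches, so the fused output is a per-channel convex combination of the branch feature maps.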


Results from the Paper


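Task                  | Dataset         | Model                             | Metric Name                 | Metric Value | Global Rank
----------------------+-----------------+-----------------------------------+-----------------------------+--------------+------------
Semantic Segmentation | ADE20K          | ResNeSt-269                       | Validation mIoU             | 47.60        | # 20
Semantic Segmentation | ADE20K          | ResNeSt-101                       | Validation mIoU             | 46.91        | # 24
Semantic Segmentation | ADE20K          | ResNeSt-200                       | Validation mIoU             | 48.36        | # 16
Semantic Segmentation | ADE20K val      | ResNeSt-101                       | mIoU                        | 46.91        | # 28
Semantic Segmentation | ADE20K val      | ResNeSt-200                       | mIoU                        | 48.36        | # 23
Semantic Segmentation | ADE20K val      | ResNeSt-269                       | mIoU                        | 47.60        | # 25
Semantic Segmentation | Cityscapes test | ResNeSt200                        | Mean IoU (class)            | 83.3         | # 7
Semantic Segmentation | Cityscapes val  | ResNeSt-200                       | mIoU                        | 82.7         | # 10
Object Detection      | COCO minival    | ResNeSt-200 (single-scale)        | box AP                      | 50.54        | # 25
Object Detection      | COCO minival    | ResNeSt-200 (single-scale)        | AP50                        | 68.78        | # 15
Object Detection      | COCO minival    | ResNeSt-200 (single-scale)        | AP75                        | 55.17        | # 11
Object Detection      | COCO minival    | ResNeSt-200 (single-scale)        | APM                         | 54.2         | # 13
Object Detection      | COCO minival    | ResNeSt-200 (single-scale)        | APL                         | 63.9         | # 13
Instance Segmentation | COCO minival    | ResNeSt-200 (multi-scale)         | mask AP                     | 46.25        | # 13
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | box AP                      | 50.91        | # 23
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | AP50                        | 69.53        | # 12
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | AP75                        | 55.40        | # 9
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | APS                         | 32.67        | # 12
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | APM                         | 54.66        | # 12
Object Detection      | COCO minival    | ResNeSt-200-DCN (single-scale)    | APL                         | 65.83        | # 11
Instance Segmentation | COCO minival    | ResNeSt-200-DCN (single-scale)    | mask AP                     | 44.5         | # 18
Panoptic Segmentation | COCO minival    | PanopticFPN+ResNeSt(single-scale) | PQ                          | 47.9         | # 6
Panoptic Segmentation | COCO minival    | PanopticFPN+ResNeSt(single-scale) | PQth                        | 55.1         | # 7
Panoptic Segmentation | COCO minival    | PanopticFPN+ResNeSt(single-scale) | PQst                        | 37.0         | # 6
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | box AP                      | 52.47        | # 20
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | AP50                        | 71.00        | # 8
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | AP75                        | 57.07        | # 7
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | APS                         | 36.80        | # 8
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | APM                         | 56.36        | # 9
Object Detection      | COCO minival    | ResNeSt-200 (multi-scale)         | APL                         | 66.29        | # 10
Instance Segmentation | COCO minival    | ResNeSt-200 (single-scale)        | mask AP                     | 44.21        | # 20
Instance Segmentation | COCO minival    | ResNeSt-101 (single-scale)        | mask AP                     | 41.56        | # 25
Instance Segmentation | COCO test-dev   | ResNeSt101                        | mask AP                     | 43%          | # 18
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | box AP                      | 53.3         | # 32
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | AP50                        | 72.0         | # 16
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | AP75                        | 58.0         | # 22
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | APS                         | 35.1         | # 19
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | APM                         | 56.2         | # 20
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | APL                         | 66.8         | # 14
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | Hardware Burden             | None         | # 1
Object Detection      | COCO test-dev   | ResNeSt-200 (multi-scale)         | Operations per network pass | None         | # 1
Instance Segmentation | COCO test-dev   | ResNeSt-200 (multi-scale)         | AP50                        | 70.2         | # 6
Instance Segmentation | COCO test-dev   | ResNeSt-200 (multi-scale)         | AP75                        | 51.5         | # 6
Instance Segmentation | COCO test-dev   | ResNeSt-200 (multi-scale)         | APS                         | 30.0         | # 5
Instance Segmentation | COCO test-dev   | ResNeSt-200 (multi-scale)         | APM                         | 49.6         | # 5
Instance Segmentation | COCO test-dev   | ResNeSt-200 (multi-scale)         | APL                         | 60.6         | # 6
Image Classification  | ImageNet        | ResNeSt-200                       | Top 1 Accuracy              | 83.9%        | # 121
Image Classification  | ImageNet        | ResNeSt-101                       | Top 1 Accuracy              | 83.0%        | # 158
Image Classification  | ImageNet        | ResNeSt-269                       | Top 1 Accuracy              | 84.5%        | # 98
Image Classification  | ImageNet        | ResNeSt-269                       | Hardware Burden             | None         | # 1
Image Classification  | ImageNet        | ResNeSt-269                       | Operations per network pass | None         | # 1
Semantic Segmentation | PASCAL Context  | ResNeSt-101                       | mIoU                        | 56.5         | # 11
Semantic Segmentation | PASCAL Context  | ResNeSt-200                       | mIoU                        | 58.4         | # 8
Semantic Segmentation | PASCAL Context  | ResNeSt-269                       | mIoU                        | 58.9         | # 7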

Methods