TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Instance Segmentation	COCO minival	RetinaNet (SpineNet-190, 1536x1536)	mask AP	46.1	# 43
Object Detection	COCO minival	RetinaNet (SpineNet-190, 1536x1536)	box AP	52.2	# 65
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	box mAP	48.6	# 89
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	AP50	68.4	# 49
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	AP75	52.5	# 56
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	APS	32	# 35
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	APM	52.3	# 43
Object Detection	COCO test-dev	RetinaNet (SpineNet-96, 1024x1024)	APL	62	# 43
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	box mAP	50.7	# 70
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	AP50	70.4	# 30
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	AP75	54.9	# 41
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	APS	33.6	# 27
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	APM	53.9	# 33
Object Detection	COCO test-dev	RetinaNet (SpineNet-143, 1280x1280)	APL	62.1	# 40
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	box mAP	52.1	# 59
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	AP50	71.8	# 22
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	AP75	56.5	# 32
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	APS	35.4	# 17
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	APM	55	# 20
Object Detection	COCO test-dev	RetinaNet (SpineNet-190, 1280x1280)	APL	63.6	# 25
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	box mAP	41.5	# 167
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	AP50	60.5	# 116
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	AP75	44.6	# 118
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	APS	23.3	# 107
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	APM	45	# 103
Object Detection	COCO test-dev	RetinaNet (SpineNet-49S, 640x640)	APL	58	# 63
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	box mAP	44.3	# 142
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	AP50	63.8	# 88
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	AP75	47.6	# 88
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	APS	25.9	# 84
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	APM	47.7	# 72
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 640x640)	APL	61.1	# 42
Instance Segmentation	COCO test-dev	Mask R-CNN (SpineNet-190, 1536x1536)	mask AP	46.1	# 36
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	box mAP	42.8	# 151
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	AP50	62.3	# 102
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	AP75	46.1	# 107
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	APS	23.7	# 102
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	APM	45.2	# 100
Object Detection	COCO test-dev	SpineNet-49 (640, RetinaNet, single-scale)	APL	57.3	# 69
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	box mAP	46.7	# 119
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	AP50	66.3	# 59
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	AP75	50.6	# 66
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	APS	29.1	# 54
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	APM	50.1	# 51
Object Detection	COCO test-dev	RetinaNet (SpineNet-49, 896x896)	APL	61.7	# 37
Image Classification	ImageNet	SpineNet-143	Top 1 Accuracy	79%	# 727
Image Classification	ImageNet	SpineNet-143	Number of params	60.5M	# 770
Image Classification	ImageNet	SpineNet-143	GFLOPs	9.1	# 287
Image Classification	iNaturalist	SpineNet-143	Top 1 Accuracy	63.6%	# 9
Image Classification	iNaturalist	SpineNet-143	Top 5 Accuracy	84.8%	# 2
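The rows above are tab-separated, so they can be loaded and grouped per model with the standard library. The following is a minimal sketch, not tooling from the paper or from Papers with Code: the names `SAMPLE`, `parse_value`, and `load_results` are hypothetical, and only a handful of rows are copied in for illustration.

```python
import csv
import io
from collections import defaultdict

# A few rows copied from the table above (tab-separated, header included).
SAMPLE = (
    "TASK\tDATASET\tMODEL\tMETRIC NAME\tMETRIC VALUE\tGLOBAL RANK\n"
    "Object Detection\tCOCO test-dev\tRetinaNet (SpineNet-96, 1024x1024)\tbox mAP\t48.6\t# 89\n"
    "Object Detection\tCOCO test-dev\tRetinaNet (SpineNet-96, 1024x1024)\tAP50\t68.4\t# 49\n"
    "Object Detection\tCOCO test-dev\tRetinaNet (SpineNet-190, 1280x1280)\tbox mAP\t52.1\t# 59\n"
    "Image Classification\tImageNet\tSpineNet-143\tTop 1 Accuracy\t79%\t# 727\n"
    "Image Classification\tImageNet\tSpineNet-143\tNumber of params\t60.5M\t# 770\n"
)


def parse_value(text):
    """Convert '48.6', '79%' or '60.5M' to a float.

    The unit suffix is dropped, so 60.5 still means 60.5 million parameters.
    """
    return float(text.strip().rstrip("%M"))


def load_results(tsv_text):
    """Group rows by (dataset, model); map metric name -> (value, rank)."""
    grouped = defaultdict(dict)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        rank = int(row["GLOBAL RANK"].lstrip("# "))  # '# 89' -> 89
        key = (row["DATASET"], row["MODEL"])
        grouped[key][row["METRIC NAME"]] = (parse_value(row["METRIC VALUE"]), rank)
    return dict(grouped)


results = load_results(SAMPLE)
print(results[("COCO test-dev", "RetinaNet (SpineNet-96, 1024x1024)")])
```

Keying on the (dataset, model) pair keeps the six COCO metrics of one configuration together, which makes it easy to compare, for example, box mAP across input resolutions.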

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/image-classification-on-inaturalist)](https://paperswithcode.com/sota/image-classification-on-inaturalist?p=spinenet-learning-scale-permuted-backbone-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/instance-segmentation-on-coco)](https://paperswithcode.com/sota/instance-segmentation-on-coco?p=spinenet-learning-scale-permuted-backbone-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/instance-segmentation-on-coco-minival)](https://paperswithcode.com/sota/instance-segmentation-on-coco-minival?p=spinenet-learning-scale-permuted-backbone-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=spinenet-learning-scale-permuted-backbone-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/object-detection-on-coco-minival)](https://paperswithcode.com/sota/object-detection-on-coco-minival?p=spinenet-learning-scale-permuted-backbone-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/spinenet-learning-scale-permuted-backbone-for/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=spinenet-learning-scale-permuted-backbone-for)`
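All six badges above follow the same URL pattern and differ only in the benchmark slug. The sketch below rebuilds that pattern from the strings shown; `pwc_badge` is a hypothetical helper, not an official API, and it assumes the pattern holds for other benchmark slugs as well.

```python
BADGE_ENDPOINT = "https://img.shields.io/endpoint.svg"
PWC_BASE = "https://paperswithcode.com"

# Paper slug as it appears in the badge URLs above.
PAPER_SLUG = "spinenet-learning-scale-permuted-backbone-for"


def pwc_badge(paper_slug, benchmark_slug):
    """Return badge markdown in the same shape as the rows listed above."""
    badge_json = f"{PWC_BASE}/badge/{paper_slug}/{benchmark_slug}"
    target = f"{PWC_BASE}/sota/{benchmark_slug}?p={paper_slug}"
    return f"[![PWC]({BADGE_ENDPOINT}?url={badge_json})]({target})"


print(pwc_badge(PAPER_SLUG, "object-detection-on-coco"))
```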

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

CVPR 2020 · Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song

Convolutional neural networks typically encode an input image into a series of intermediate features with decreasing resolutions. While this structure is suited to classification tasks, it does not perform well for tasks requiring simultaneous recognition and localization (e.g., object detection). Encoder-decoder architectures have been proposed to resolve this by applying a decoder network onto a backbone model designed for classification tasks. In this paper, we argue that the encoder-decoder architecture is ineffective at generating strong multi-scale features because of the scale-decreased backbone. We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search. Using similar building blocks, SpineNet models outperform ResNet-FPN models by ~3% AP at various scales while using 10-20% fewer FLOPs. In particular, SpineNet-190 achieves 52.5% AP with a Mask R-CNN detector and 52.1% AP with a RetinaNet detector on COCO for a single model without test-time augmentation, significantly outperforming prior detectors. SpineNet can transfer to classification tasks, achieving a 5% top-1 accuracy improvement on the challenging iNaturalist fine-grained dataset. Code is at: https://github.com/tensorflow/tpu/tree/master/models/official/detection.
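To make "cross-scale connections" concrete, the toy sketch below merges feature maps from different pyramid levels into one target level. It is an illustration under stated assumptions, not the authors' implementation: maps are single-channel nested lists, level l is assumed to have stride 2^l, and nearest-neighbor upsampling, average pooling, and an elementwise sum stand in for whatever resampling and fusion the released models use. All function names are hypothetical.

```python
def upsample_nearest(fmap, factor):
    """Repeat every value `factor` times along both axes."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def downsample_mean(fmap, factor):
    """Average non-overlapping factor x factor windows."""
    h, w = len(fmap) // factor, len(fmap[0]) // factor
    return [
        [
            sum(
                fmap[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]


def resample(fmap, src_level, dst_level):
    """Move a map from src_level to dst_level (assumed stride 2**level)."""
    if dst_level < src_level:  # target is finer: upsample
        return upsample_nearest(fmap, 2 ** (src_level - dst_level))
    if dst_level > src_level:  # target is coarser: downsample
        return downsample_mean(fmap, 2 ** (dst_level - src_level))
    return [list(row) for row in fmap]


def fuse(parents, dst_level):
    """Sum parent maps, given as (fmap, level), after resampling to dst_level."""
    resampled = [resample(f, lvl, dst_level) for f, lvl in parents]
    h, w = len(resampled[0]), len(resampled[0][0])
    return [[sum(r[i][j] for r in resampled) for j in range(w)] for i in range(h)]


# Toy maps for a 32x32 input: level 3 is 4x4, level 5 is 1x1.
p3 = [[1, 1, 3, 3], [1, 1, 3, 3], [5, 5, 7, 7], [5, 5, 7, 7]]
p5 = [[10]]
fused_l4 = fuse([(p3, 3), (p5, 5)], dst_level=4)  # 2x2 map at level 4
print(fused_l4)
```

Because a block at any level can take parents from both finer and coarser levels, the order of levels along the network no longer has to decrease monotonically, which is the property the abstract calls scale permutation.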

Code

tensorflow/models official

73,120

tensorflow/tpu official

5,176

lucifer443/SpineNet-Pytorch

yan-roo/SpineNet-Pytorch

See all 13 implementations

Tasks

General Classification

Image Classification

Instance Segmentation

Neural Architecture Search

Object Detection

Point Cloud Registration

Real-Time Object Detection

Datasets

ImageNet

MS COCO

iNaturalist

Results from the Paper

Ranked #9 on Image Classification on iNaturalist

Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Cosine Annealing • Dense Connections • Entropy Regularization • Focal Loss • FPN • Global Average Pooling • Kaiming Initialization • Linear Warmup With Cosine Annealing • LSTM • Mask R-CNN • Max Pooling • NAS-FPN • Neural Architecture Search • PPO • Random Horizontal Flip • Random Resized Crop • ReLU • Residual Block • Residual Connection • ResNet • RetinaNet • RoIAlign • RPN • SGD with Momentum • Sigmoid Activation • Softmax • SpineNet • Stochastic Depth • Swish • Tanh Activation • Weight Decay
