TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Object Detection	COCO minival	Cascade RCNN-RS (SpineNet-143L, single scale)	box AP	53.6	# 57
Object Detection	COCO minival	Cascade RCNN-RS (SpineNet-143L, single scale)	APS	34.5	# 11
Object Detection	COCO minival	Cascade RCNN-RS (SpineNet-143L, single scale)	APM	56.7	# 10
Object Detection	COCO minival	Cascade RCNN-RS (SpineNet-143L, single scale)	APL	70.6	# 8
Object Detection	COCO minival	Cascade RCNN-RS (ResNet-200, single scale)	box AP	53.1	# 60
Object Detection	COCO minival	Cascade RCNN-RS (ResNet-200, single scale)	APS	33.9	# 12
Object Detection	COCO minival	Cascade RCNN-RS (ResNet-200, single scale)	APM	56.2	# 12
Object Detection	COCO minival	Cascade RCNN-RS (ResNet-200, single scale)	APL	70.3	# 9
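The per-metric gap between the two backbones in the rows above can be read off with a little arithmetic. The snippet below is a minimal sketch: the numbers are copied from the table, while the dictionary layout and the `metric_gaps` helper are illustrative assumptions, not part of any published tooling.

```python
# Metric values copied from the COCO minival rows above (single-scale results).
results = {
    "SpineNet-143L": {"box AP": 53.6, "APS": 34.5, "APM": 56.7, "APL": 70.6},
    "ResNet-200": {"box AP": 53.1, "APS": 33.9, "APM": 56.2, "APL": 70.3},
}


def metric_gaps(table, first, second):
    """Return first-minus-second for every metric, rounded to one decimal."""
    # Hypothetical helper for illustration; rounding hides float noise only.
    return {
        metric: round(table[first][metric] - table[second][metric], 1)
        for metric in table[first]
    }


gaps = metric_gaps(results, "SpineNet-143L", "ResNet-200")
for metric, gap in gaps.items():
    print(f"{metric}: +{gap}")
```

On these rows the SpineNet-143L variant leads on every metric by between 0.3 and 0.6 points.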

Simple Training Strategies and Model Scaling for Object Detection

30 Jun 2021 · Xianzhi Du, Barret Zoph, Wei-Chih Hung, Tsung-Yi Lin

The speed-accuracy Pareto curve of object detection systems has advanced through a combination of better model architectures, training methods, and inference methods. In this paper, we methodically evaluate a variety of these techniques to understand where most of the improvements in modern detection systems come from. We benchmark these improvements on the vanilla ResNet-FPN backbone with RetinaNet and RCNN detectors. The vanilla detectors are improved by 7.7% in accuracy while being 30% faster. We further provide simple scaling strategies to generate a family of models that form two Pareto curves, named RetinaNet-RS and Cascade RCNN-RS. These simple rescaled detectors explore the speed-accuracy trade-off between the one-stage RetinaNet detectors and two-stage RCNN detectors. Our largest Cascade RCNN-RS models achieve 52.9% AP with a ResNet152-FPN backbone and 53.6% with a SpineNet143L backbone. Finally, we show that the ResNet architecture, with three minor architectural changes, outperforms EfficientNet as the backbone for object detection and instance segmentation systems.
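To make the notion of a speed-accuracy Pareto curve concrete, here is a minimal sketch of how one might extract the non-dominated models from a set of (latency, AP) measurements. The `Detector` type, the `pareto_front` function, and every number in `candidates` are placeholders invented for illustration; none of them come from the paper or its code release.

```python
from typing import NamedTuple


class Detector(NamedTuple):
    name: str
    latency_ms: float  # lower is better
    box_ap: float      # higher is better


def pareto_front(models: list[Detector]) -> list[Detector]:
    """Return the models not dominated on (latency, AP), sorted by latency."""
    front: list[Detector] = []
    best_ap = float("-inf")
    # Sort by latency ascending; on ties, visit the higher-AP model first.
    for m in sorted(models, key=lambda d: (d.latency_ms, -d.box_ap)):
        # Keep a model only if it beats every faster (or equally fast) model on AP.
        if m.box_ap > best_ap:
            front.append(m)
            best_ap = m.box_ap
    return front


# Placeholder numbers for illustration only; NOT measurements from the paper.
candidates = [
    Detector("model_a", 20.0, 40.0),
    Detector("model_b", 30.0, 38.0),  # slower and less accurate than model_a
    Detector("model_c", 45.0, 46.0),
    Detector("model_d", 80.0, 50.0),
    Detector("model_e", 90.0, 49.0),  # slower and less accurate than model_d
]

front = pareto_front(candidates)
front_names = [m.name for m in front]
print(front_names)
```

A model sits on the curve when no other model is both faster and more accurate, which is the sense in which a family of rescaled detectors can "form" a Pareto curve.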

Code

tensorflow/tpu official

5,176

Tasks

Instance Segmentation

Object Detection

Semantic Segmentation

Datasets

MS COCO

Waymo Open Dataset

Results from the Paper

Ranked #57 on Object Detection on COCO minival


Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Dense Connections • Depthwise Convolution • Depthwise Separable Convolution • Dropout • Focal Loss • FPN • Global Average Pooling • Inverted Residual Block • Kaiming Initialization • Max Pooling • Pointwise Convolution • ReLU • Residual Block • Residual Connection • ResNet • ResNet-D • RetinaNet • RetinaNet-RS • RMSProp • Sigmoid Activation • SiLU • Squeeze-and-Excitation Block • Swish • Xavier Initialization
