TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Object Detection	COCO minival	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	box AP	54.2	# 56
Object Detection	COCO test-dev	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	box mAP	54.3	# 51
Object Detection	COCO test-dev	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	Hardware Burden	None	# 1
Object Detection	COCO test-dev	SpineNet-190 (1280, with Self-training on OpenImages, single-scale)	Operations per network pass	None	# 1
Semantic Segmentation	PASCAL VOC 2012 val	EfficientNet-L2+NAS-FPN (single scale test, with self-training)	mIoU	90.0%	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/rethinking-pre-training-and-self-training/semantic-segmentation-on-pascal-voc-2012-val)](https://paperswithcode.com/sota/semantic-segmentation-on-pascal-voc-2012-val?p=rethinking-pre-training-and-self-training)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/rethinking-pre-training-and-self-training/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=rethinking-pre-training-and-self-training)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/rethinking-pre-training-and-self-training/object-detection-on-coco-minival)](https://paperswithcode.com/sota/object-detection-on-coco-minival?p=rethinking-pre-training-and-self-training)`

Rethinking Pre-training and Self-training

NeurIPS 2020 · Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin D. Cubuk, Quoc V. Le

Pre-training is a dominant paradigm in computer vision. For example, supervised ImageNet pre-training is commonly used to initialize the backbones of object detection and segmentation models. He et al., however, show a surprising result that ImageNet pre-training has limited impact on COCO object detection. Here we investigate self-training as another method to utilize additional data on the same setup and contrast it against ImageNet pre-training. Our study reveals the generality and flexibility of self-training with three additional insights: 1) stronger data augmentation and more labeled data further diminish the value of pre-training, 2) unlike pre-training, self-training is always helpful when using stronger data augmentation, in both low-data and high-data regimes, and 3) in the case that pre-training is helpful, self-training improves upon pre-training. For example, on the COCO object detection dataset, pre-training helps when we use one fifth of the labeled data, and hurts accuracy when we use all labeled data. Self-training, on the other hand, shows positive improvements from +1.3 to +3.4 AP across all dataset sizes. In other words, self-training works well on exactly the same setup where pre-training does not (using ImageNet to help COCO). On the PASCAL segmentation dataset, which is much smaller than COCO, pre-training does help significantly, yet self-training still improves upon the pre-trained model. On COCO object detection, we achieve 54.3 AP, an improvement of +1.5 AP over the strongest SpineNet model. On PASCAL segmentation, we achieve 90.5 mIOU, an improvement of +1.5% mIOU over the previous state-of-the-art result by DeepLabv3+.
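The abstract contrasts pre-training with self-training but does not spell out the self-training loop. As a rough illustration only, the sketch below shows the generic pseudo-labeling pattern: a teacher is fit on labeled data, it labels the unlabeled data, and a student is fit on the union of both. Everything here is an assumption made for illustration: the toy nearest-centroid classifier, the margin-based confidence score, the `threshold` parameter, and the function names are hypothetical, and none of it is the paper's detection or segmentation pipeline.

```python
# Minimal sketch of generic pseudo-label self-training on 2-D toy points.
# Hypothetical illustration; not the implementation used in the paper.

def fit_centroids(points, labels):
    """Fit a nearest-centroid classifier: one mean point per class."""
    sums, counts = {}, {}
    for (x, y), c in zip(points, labels):
        sx, sy = sums.get(c, (0.0, 0.0))
        sums[c] = (sx + x, sy + y)
        counts[c] = counts.get(c, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}


def predict(centroids, point):
    """Return (label, confidence); needs at least two classes.

    Confidence is an assumed margin score: 1 - d_nearest / d_second_nearest,
    so a point equidistant from two centroids gets confidence 0.
    """
    dists = sorted(
        (((point[0] - cx) ** 2 + (point[1] - cy) ** 2) ** 0.5, c)
        for c, (cx, cy) in centroids.items()
    )
    (d1, label), (d2, _) = dists[0], dists[1]
    conf = 1.0 - d1 / d2 if d2 > 0 else 0.0
    return label, conf


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.5):
    """One round of self-training; returns (teacher, student, n_pseudo)."""
    # 1) Train the teacher on human-labeled data only.
    teacher = fit_centroids(labeled_x, labeled_y)

    # 2) Pseudo-label the unlabeled data, keeping confident predictions.
    pseudo_x, pseudo_y = [], []
    for p in unlabeled_x:
        label, conf = predict(teacher, p)
        if conf >= threshold:
            pseudo_x.append(p)
            pseudo_y.append(label)

    # 3) Train the student jointly on human labels and pseudo labels.
    student = fit_centroids(labeled_x + pseudo_x, labeled_y + pseudo_y)
    return teacher, student, len(pseudo_x)


labeled_x = [(0.0, 0.0), (4.0, 4.0)]
labeled_y = [0, 1]
unlabeled_x = [(0.5, 0.5), (3.5, 3.5), (2.0, 2.0)]  # last point is ambiguous

teacher, student, n_pseudo = self_train(labeled_x, labeled_y, unlabeled_x)
print("pseudo-labeled points kept:", n_pseudo)
print("student centroids:", student)
```

In this toy run the ambiguous midpoint is rejected by the confidence filter, so only the two confident points influence the student. Pre-training, by contrast, would use the extra data to initialize the model rather than to supply additional training targets.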


Code


tensorflow/tpu

5,183

stanleyjzheng/PyData

Tasks


Data Augmentation

Object

object-detection

Object Detection

Segmentation

Semantic Segmentation

Datasets

ImageNet

MS COCO

Results from the Paper


Ranked #1 on Semantic Segmentation on PASCAL VOC 2012 val



Methods


1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Entropy Regularization • Global Average Pooling • LSTM • NAS-FPN • Neural Architecture Search • PPO • ReLU • Residual Block • Residual Connection • Sigmoid Activation • Softmax • SpineNet • Swish • Tanh Activation
