TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	CIFAR-10	PyramidNet-272, S=4	Percentage correct	98.71	# 26
Image Classification	CIFAR-10	PyramidNet-272, S=4	PARAMS	32.6M	# 219
Image Classification	CIFAR-10	Shake-Shake 26 2x96d, S=4	Percentage correct	98.31	# 39
Image Classification	CIFAR-10	Shake-Shake 26 2x96d, S=4	PARAMS	26.3M	# 217
Image Classification	CIFAR-10	WRN-28-10, S=4	Percentage correct	98.32	# 38
Image Classification	CIFAR-10	WRN-28-10, S=4	PARAMS	36.7M	# 227
Image Classification	CIFAR-10	WRN-40-10, S=4	Percentage correct	98.38	# 37
Image Classification	CIFAR-10	WRN-40-10, S=4	PARAMS	55.9M	# 231
Image Classification	CIFAR-100	DenseNet-BC-190, S=4	Percentage correct	87.44	# 46
Image Classification	CIFAR-100	DenseNet-BC-190, S=4	PARAMS	26.3M	# 193
Image Classification	CIFAR-100	PyramidNet-272, S=4	Percentage correct	89.46	# 31
Image Classification	CIFAR-100	PyramidNet-272, S=4	PARAMS	32.8M	# 194
Image Classification	CIFAR-100	WRN-28-10, S=4	Percentage correct	85.74	# 59
Image Classification	CIFAR-100	WRN-40-10, S=4	Percentage correct	86.90	# 49
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(320px)	Top 1 Accuracy	83.6%	# 378
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(320px)	Number of params	98M	# 859
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(320px)	GFLOPs	38.2	# 410
Image Classification	ImageNet	ResNeXt-101, 64x4d, S=2(224px)	Top 1 Accuracy	82.13%	# 523
Image Classification	ImageNet	ResNeXt-101, 64x4d, S=2(224px)	Number of params	88.6M	# 842
Image Classification	ImageNet	ResNeXt-101, 64x4d, S=2(224px)	GFLOPs	18.8	# 361
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(416px)	Top 1 Accuracy	83.34%	# 402
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(416px)	Number of params	98M	# 859
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(416px)	Hardware Burden	None	# 1
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(416px)	Operations per network pass	None	# 1
Image Classification	ImageNet	SE-ResNeXt-101, 64x4d, S=2(416px)	GFLOPs	61.1	# 433

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/splitnet-divide-and-co-training/image-classification-on-cifar-10)](https://paperswithcode.com/sota/image-classification-on-cifar-10?p=splitnet-divide-and-co-training)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/splitnet-divide-and-co-training/image-classification-on-cifar-100)](https://paperswithcode.com/sota/image-classification-on-cifar-100?p=splitnet-divide-and-co-training)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/splitnet-divide-and-co-training/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=splitnet-divide-and-co-training)`

Towards Better Accuracy-efficiency Trade-offs: Divide and Co-training

30 Nov 2020 · Shuai Zhao, Liguang Zhou, Wenxiao Wang, Deng Cai, Tin Lun Lam, Yangsheng Xu

The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon saturates. In this case, we argue that increasing the number of networks (ensemble) can achieve better accuracy-efficiency trade-offs than purely increasing the width. To test this, one large network is divided into several small ones with respect to its parameters and regularization components. Each of these small networks has a fraction of the original one's parameters. We then train these small networks together and make them see various views of the same data to increase their diversity. During this co-training process, networks can also learn from each other. As a result, small networks can achieve better ensemble performance than the large one with few or no extra parameters or FLOPs, i.e., achieving better accuracy-efficiency trade-offs. Small networks can also achieve faster inference speed than the large one by running concurrently. All of the above shows that the number of networks is a new dimension of model scaling. We validate our argument with 8 different neural architectures on common benchmarks through extensive experiments. The code is available at https://github.com/FreeformRobotics/Divide-and-Co-training.
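
The divide, co-train, and ensemble steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: every function name below is hypothetical, the square-root width rule assumes that layer parameters grow roughly with the square of the width, and the Jensen-Shannon term is only one plausible way to let networks "learn from each other" that the abstract itself does not specify. S denotes the number of small networks, as in the model names in the results table.

```python
import math


def split_width(width, num_networks):
    """Width of each small network so total parameters stay roughly constant.

    Assumption (not stated in the abstract): layer parameters scale with
    width squared, so S networks of width W / sqrt(S) hold about
    S * (W^2 / S) = W^2 parameters in total.
    """
    return int(round(width / math.sqrt(num_networks)))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probs(logits_per_network):
    """Ensemble prediction: average the class probabilities of all networks."""
    probs = [softmax(logits) for logits in logits_per_network]
    num_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def js_divergence(logits_per_network):
    """Generalized Jensen-Shannon divergence among the networks' predictions.

    Computed as H(mean of p_i) - mean of H(p_i). Used here as a hypothetical
    co-training term that shrinks as the networks' predictions agree.
    """
    probs = [softmax(logits) for logits in logits_per_network]
    mean_entropy = sum(entropy(p) for p in probs) / len(probs)
    return entropy(ensemble_probs(logits_per_network)) - mean_entropy


# A widening factor of 10 split across S=4 networks.
print(split_width(10, 4))  # prints 5

# Two small networks scoring the same input over three classes.
logits = [[2.0, 1.0, 0.0], [0.5, 2.5, 0.0]]
print(ensemble_probs(logits))
print(js_divergence(logits))
```

Averaging probabilities rather than logits keeps the ensemble output a valid distribution regardless of how differently scaled the individual networks' logits are, which is why the sketch takes that route.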


Code


freeformrobotics/divide-and-co-trai… official

101

mzhaoshuai/Divide-and-Co-training official

101

Tasks


Image Classification

Datasets

CIFAR-10

ImageNet

MS COCO

CIFAR-100

ssd

Results from the Paper


Ranked #26 on Image Classification on CIFAR-10


