Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

17 Sep 2019 · Umar Asif, Jianbin Tang, Stefan Harrer

Ensemble models comprising deep Convolutional Neural Networks (CNNs) have shown significant improvements in model generalization, but at the cost of large computation and memory requirements. In this paper, we present a framework for learning compact CNN models with improved classification performance and model generalization. For this, we propose a compact student architecture with parallel branches that are trained using ground-truth labels and information from high-capacity teacher networks in an ensemble-learning fashion. Our framework provides two main benefits: i) distilling knowledge from different teachers into the student network promotes heterogeneity in feature learning across the branches of the student network and enables the network to learn diverse solutions to the target problem; ii) coupling the branches of the student network through ensembling encourages collaboration and improves the quality of the final predictions by reducing variance in the network outputs. Experiments on the well-established CIFAR-10 and CIFAR-100 datasets show that our Ensemble Knowledge Distillation (EKD) improves classification accuracy and model generalization, especially in situations with limited training data. Experiments also show that our EKD-based compact networks outperform state-of-the-art knowledge-distillation-based methods in terms of mean accuracy on the test datasets.
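A minimal sketch of how this kind of ensemble knowledge distillation could be set up in PyTorch is shown below. The branch architecture, number of branches, temperature, and loss weight are illustrative assumptions, not the paper's exact configuration; teacher logits are assumed to come from separate, frozen high-capacity networks.

```python
# Hypothetical sketch of ensemble knowledge distillation (EKD):
# a compact student with parallel branches, each branch distilled from a
# different (frozen) teacher, and the branch outputs averaged into an
# ensemble prediction trained against the ground-truth labels.
# Branch/teacher setup, temperature, and loss weights are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BranchedStudent(nn.Module):
    """Compact student with parallel branches sharing a small stem."""

    def __init__(self, num_classes: int = 10, num_branches: int = 2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
        )
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(16, 32, 3, stride=2, padding=1),
                nn.BatchNorm2d(32), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, num_classes),
            )
            for _ in range(num_branches)
        ])

    def forward(self, x):
        h = self.stem(x)
        branch_logits = [branch(h) for branch in self.branches]
        # Ensemble prediction: average the branch logits.
        ensemble_logits = torch.stack(branch_logits).mean(dim=0)
        return branch_logits, ensemble_logits


def ekd_loss(branch_logits, ensemble_logits, teacher_logits, labels,
             temperature=4.0, alpha=0.5):
    """Cross-entropy on the ensemble output plus per-branch KL distillation,
    pairing branch i with teacher i (one teacher per branch)."""
    loss = F.cross_entropy(ensemble_logits, labels)
    for s_logits, t_logits in zip(branch_logits, teacher_logits):
        soft_t = F.softmax(t_logits / temperature, dim=1)
        log_soft_s = F.log_softmax(s_logits / temperature, dim=1)
        kd = F.kl_div(log_soft_s, soft_t, reduction="batchmean") * temperature ** 2
        loss = loss + alpha * kd
    return loss


if __name__ == "__main__":
    # Toy usage with random tensors standing in for CIFAR-10 images and
    # pre-computed teacher logits (in practice produced by frozen teachers).
    student = BranchedStudent(num_classes=10, num_branches=2)
    images = torch.randn(8, 3, 32, 32)
    labels = torch.randint(0, 10, (8,))
    teacher_logits = [torch.randn(8, 10), torch.randn(8, 10)]

    branch_logits, ensemble_logits = student(images)
    loss = ekd_loss(branch_logits, ensemble_logits, teacher_logits, labels)
    loss.backward()
    print(f"EKD loss: {loss.item():.4f}")
```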

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Knowledge Distillation | ImageNet | ADLIK-MO-P25 (T: SeNet154, ResNet152b; S: ResNet-50-prune25%) | Top-1 accuracy (%) | 78.79 | #7 |
| | | | Model size | 56.9M | #3 |
| | | | CRD training setting | — | #1 |
| Knowledge Distillation | ImageNet | ADLIK-MO-P50 (T: SeNet154, ResNet152b; S: ResNet-50-half) | Top-1 accuracy (%) | 76.376 | #13 |
| | | | Model size | 27M | #6 |
| | | | CRD training setting | — | #1 |
| Knowledge Distillation | ImageNet | ADLIK-MO-P375 (T: SeNet154, ResNet152b; S: ResNet-50-prune37.5%) | Top-1 accuracy (%) | 78.07 | #8 |
| | | | Model size | 40.5M | #4 |
| | | | CRD training setting | — | #1 |