Model Compression

342 papers with code • 2 benchmarks • 4 datasets

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power and resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization and weight quantization are some of the methods proposed to compress the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
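
To make the weight-quantization idea concrete, the sketch below applies symmetric per-tensor int8 quantization to a linear layer's weights, shrinking them from 4 bytes to 1 byte per parameter. This is a minimal illustration with names of our own choosing, not code from any of the papers listed on this page.

```python
import torch

def quantize_weights_int8(weight: torch.Tensor):
    """Symmetric per-tensor int8 quantization of a weight matrix."""
    scale = weight.abs().max() / 127.0           # map the largest magnitude to 127
    q = torch.clamp(torch.round(weight / scale), -128, 127).to(torch.int8)
    return q, scale                              # store int8 values plus one fp32 scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale                     # approximate reconstruction of the weights

# Example: compress a linear layer's fp32 weights (4 bytes each) to int8 (1 byte each)
layer = torch.nn.Linear(512, 512)
q, scale = quantize_weights_int8(layer.weight.data)
w_hat = dequantize(q, scale)
print("max reconstruction error:", (layer.weight.data - w_hat).abs().max().item())
```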

Most implemented papers

Training with Quantization Noise for Extreme Model Compression

pytorch/fairseq ICLR 2021

A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.
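
The Straight-Through Estimator is commonly implemented with a detach trick: the forward pass sees the quantized weights, while the backward pass treats quantization as the identity. The sketch below shows this generic pattern under our own naming; it is not the fairseq implementation.

```python
import torch

def fake_quantize_ste(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Quantize in the forward pass; pass gradients straight through in the backward pass."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
    # Straight-through estimator: forward uses w_q, backward sees d(w_q)/dw = 1.
    return w + (w_q - w).detach()

w = torch.randn(4, 4, requires_grad=True)
loss = fake_quantize_ste(w).sum()
loss.backward()
print(w.grad)  # all ones: the gradient bypasses the non-differentiable rounding
```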

LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

microsoft/NeuralSpeech 8 Feb 2021

Text to speech (TTS) has been broadly used to synthesize natural and intelligible speech in different scenarios.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning

eparisotto/ActorMimic 19 Nov 2015

The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent.

MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Face Images

cuguilke/microexpnet 19 Nov 2017

On the other hand, KD is proven to be useful for model compression for the FER problem, and we discovered that its effect becomes more and more significant as the model size decreases.
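
Several of the entries on this page rely on knowledge distillation, where a small student is trained to mimic the softened outputs of a large teacher. The sketch below shows the standard soft-target KD loss as a generic illustration; it is not the exact objective used by MicroExpNet or the other listed papers, and the hyperparameter values are placeholders.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T: float = 4.0, alpha: float = 0.7):
    """Weighted sum of the soft-target distillation loss and the standard cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                       # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(kd_loss(student_logits, teacher_logits, labels))
```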

Patient Knowledge Distillation for BERT Model Compression

intersun/PKD-for-BERT-Model-Compression IJCNLP 2019

Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks.

Contrastive Representation Distillation

HobbitLong/RepDistiller ICLR 2020

We demonstrate that the standard knowledge distillation objective ignores important structural knowledge of the teacher network.
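
Contrastive distillation instead encourages the student to preserve relationships across the teacher's representation space. The code below is a rough InfoNCE-style sketch of that idea, a simplification rather than the paper's exact CRD formulation (which uses a memory buffer of negatives).

```python
import torch
import torch.nn.functional as F

def contrastive_distill_loss(student_feat, teacher_feat, temperature: float = 0.1):
    """Pull each student embedding toward its own teacher embedding, away from others in the batch."""
    s = F.normalize(student_feat, dim=-1)
    t = F.normalize(teacher_feat, dim=-1)
    logits = s @ t.t() / temperature              # (batch, batch) similarity matrix
    targets = torch.arange(s.size(0))             # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

student_feat = torch.randn(16, 128, requires_grad=True)
teacher_feat = torch.randn(16, 128)
print(contrastive_distill_loss(student_feat, teacher_feat))
```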

Data-Free Adversarial Distillation

VainF/Data-Free-Adversarial-Distillation 23 Dec 2019

Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer.

ZeroQ: A Novel Zero Shot Quantization Framework

amirgholami/ZeroQ CVPR 2020

Importantly, ZeroQ has a very low computational overhead, and it can finish the entire quantization process in less than 30s (0.5% of one epoch of ResNet50 training time on ImageNet).

Sharpness-aware Quantization for Deep Neural Networks

ziplab/saq 24 Nov 2021

However, the abrupt changes in quantized weights during training often lead to severe loss fluctuations and result in a sharp loss landscape, making the gradients unstable and thus degrading performance.

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale

microsoft/DeepSpeed 14 Jan 2022

As the training of giant dense models hits the limits of the availability and capability of today's hardware resources, Mixture-of-Experts (MoE) models have become one of the most promising model architectures due to their significant training cost reduction compared to a quality-equivalent dense model.
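
The cost advantage of MoE comes from conditional computation: a router activates only a small subset of experts per token, so parameter count grows without a proportional increase in compute. Below is a toy top-1-gated MoE layer of our own; it is not DeepSpeed-MoE's optimized implementation, which additionally handles load balancing, capacity limits, and expert parallelism.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Toy mixture-of-experts layer: each token is routed to a single expert FFN."""
    def __init__(self, d_model: int, d_hidden: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                          # x: (tokens, d_model)
        gate_probs = F.softmax(self.gate(x), dim=-1)
        expert_idx = gate_probs.argmax(dim=-1)     # top-1 routing decision per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # Scale by the gate probability so the routing weights stay differentiable.
                out[mask] = expert(x[mask]) * gate_probs[mask, i].unsqueeze(-1)
        return out

moe = Top1MoE(d_model=64, d_hidden=256, num_experts=4)
tokens = torch.randn(32, 64)
print(moe(tokens).shape)  # torch.Size([32, 64])
```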