Model Compression
342 papers with code • 2 benchmarks • 4 datasets
Model compression has been an active area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.
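Two of the techniques named above, magnitude-based parameter pruning and uniform weight quantization, can be illustrated on a single weight matrix. This is a minimal numpy sketch of the general ideas, not any specific paper's method; the 90% sparsity target and 8-bit width are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)

# Parameter pruning: zero out the ~90% of weights with smallest magnitude.
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Weight quantization: map the surviving weights to 8-bit integers
# with a single symmetric scale factor.
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

sparsity = (pruned == 0).mean()
print(f"sparsity: {sparsity:.2f}")  # roughly 0.90
```

In practice the sparse int8 tensor (indices plus values) is what gets stored or shipped to the device; the dequantized weights are used at inference time.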
Libraries
Use these libraries to find Model Compression models and implementations

Most implemented papers
Data-Free Knowledge Distillation for Deep Neural Networks
Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy.
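Knowledge distillation, which this paper builds on, compresses a network by training a small student to match the temperature-softened outputs of a large teacher. A minimal numpy sketch of the standard distillation loss follows; the function names and the temperature value are illustrative, and the data-free variant the paper proposes (which synthesizes its own training inputs) is not shown:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 as is conventional so gradients stay comparable across T."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean()
    return float(kl * T * T)

teacher = np.array([[5.0, 1.0, -2.0]])
print(distillation_loss(teacher, teacher))  # 0.0 when student matches teacher
```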
Weightless: Lossy Weight Encoding For Deep Neural Network Compression
This results in up to a 1.51x improvement over the state-of-the-art.
Paraphrasing Complex Network: Network Compression via Factor Transfer
Among model compression methods, knowledge transfer trains a student network under the guidance of a stronger teacher network.
Dynamic Channel Pruning: Feature Boosting and Suppression
Making deep convolutional neural networks more accurate typically comes at the cost of increased computational and memory resources.
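The core idea of feature boosting and suppression is to predict a per-channel saliency at runtime and compute only the most salient channels, amplifying those and zeroing the rest. This is a simplified numpy sketch of that gating step, not the paper's exact formulation (in the paper the saliency predictor is a small learned subnetwork):

```python
import numpy as np

def gate_channels(features, saliency, k):
    """Keep the k most salient channels, suppress (zero) the rest.

    features: (C, H, W) activation tensor; saliency: (C,) predicted scores.
    Kept channels are scaled (boosted) by their saliency.
    """
    keep = np.argsort(saliency)[-k:]      # indices of the top-k channels
    mask = np.zeros_like(saliency)
    mask[keep] = saliency[keep]           # boost kept channels, zero the rest
    return features * mask[:, None, None]

feats = np.ones((8, 4, 4))
sal = np.arange(8, dtype=float)
out = gate_channels(feats, sal, k=2)
print(int((out.sum(axis=(1, 2)) > 0).sum()))  # 2 channels remain active
```

Because suppressed channels are zero, the convolutions that would produce them can be skipped entirely, which is where the compute savings come from.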
GASL: Guided Attention for Sparsity Learning in Deep Neural Networks
The main goal of network pruning is to impose sparsity on a neural network by increasing the number of zero-valued parameters, reducing the architecture's size and yielding computational speedup.
Model Compression with Adversarial Robustness: A Unified Optimization Framework
Deep model compression has been extensively studied, and state-of-the-art methods can now achieve high compression ratios with minimal accuracy loss.
Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM
A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results.
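ADMM-based pruning alternates between a loss-minimization step on the weights and a Euclidean projection onto the sparsity constraint set, coupled by a dual variable. This toy numpy sketch applies the same alternation to a least-squares problem rather than a neural network; the sparsity level k, penalty rho, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(30, 10))
b = rng.normal(size=30)
k, rho = 3, 1.0  # keep at most k nonzero weights

def project_topk(v, k):
    """Euclidean projection onto {vectors with at most k nonzeros}:
    keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

W = np.zeros(10)  # primal weights
Z = np.zeros(10)  # sparse copy
U = np.zeros(10)  # scaled dual variable
lhs = A.T @ A + rho * np.eye(10)
for _ in range(50):
    # W-step: minimize ||A W - b||^2 + (rho/2)||W - Z + U||^2 in closed form.
    W = np.linalg.solve(lhs, A.T @ b + rho * (Z - U))
    Z = project_topk(W + U, k)  # Z-step: project onto the sparsity set
    U = U + W - Z               # dual update

print(int((Z != 0).sum()))  # at most k = 3 nonzeros survive
```

In the DNN setting the closed-form W-step is replaced by SGD on the training loss plus the same quadratic penalty, and the projection is applied per layer.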
Light Multi-segment Activation for Model Compression
Inspired by the expressive power of neural networks, we propose a multi-segment activation that significantly improves the expressiveness of the compact student model at very little cost.
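A multi-segment activation is a piecewise-linear function that generalizes ReLU: the input range is split at breakpoints, and each segment applies its own slope and intercept (which can be learned). The numpy sketch below is an illustration of this general idea, not the paper's exact formulation; all breakpoint and slope values are made up:

```python
import numpy as np

def multi_segment_activation(x, breakpoints, slopes, intercepts):
    """Piecewise-linear activation: each input element falls into one
    segment (determined by the breakpoints) and is mapped by that
    segment's slope and intercept."""
    seg = np.searchsorted(breakpoints, x)  # segment index per element
    return slopes[seg] * x + intercepts[seg]

# Two breakpoints -> three segments: flat, shallow, identity.
bps = np.array([-1.0, 0.0])
slopes = np.array([0.0, 0.1, 1.0])
intercepts = np.array([-0.1, 0.0, 0.0])

x = np.array([-2.0, -0.5, 2.0])
print(multi_segment_activation(x, bps, slopes, intercepts))
```

With slopes `[0, 0, 1]` and zero intercepts this reduces to ReLU; extra segments add expressiveness while costing only a handful of scalar parameters per activation.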
Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems
Offloading the execution of complex Deep Neural Network (DNN) models to compute-capable devices at the network edge, that is, edge servers, can significantly reduce capture-to-output delay.
How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?
In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance.