Model Compression

342 papers with code • 2 benchmarks • 4 datasets

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
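
To make one of these ideas concrete, here is a minimal low-rank factorization sketch (not taken from any of the papers below): a single nn.Linear layer is replaced by two thinner layers obtained from a truncated SVD, with the rank a free choice trading accuracy for size.

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace one Linear layer with two low-rank factors via truncated SVD.
    Parameters drop from in*out to rank*(in + out) when rank is small."""
    W = layer.weight.data                    # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]             # fold singular values into U
    V_r = Vh[:rank, :]

    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data.copy_(V_r)
    second.weight.data.copy_(U_r)
    if layer.bias is not None:
        second.bias.data.copy_(layer.bias.data)
    return nn.Sequential(first, second)

layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)    # ~4x fewer parameters
x = torch.randn(1, 512)
print((layer(x) - compressed(x)).abs().max())    # lossy: error shrinks as rank grows
```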

Most implemented papers

Data-Free Knowledge Distillation for Deep Neural Networks

huawei-noah/DAFL 19 Oct 2017

Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most, if not all, of their accuracy.

Weightless: Lossy Weight Encoding For Deep Neural Network Compression

cambridge-mlg/miracle 13 Nov 2017

This results in up to a 1.51x improvement over the state-of-the-art.

Paraphrasing Complex Network: Network Compression via Factor Transfer

Jangho-Kim/Factor-Transfer-pytorch NeurIPS 2018

Among model compression methods, knowledge transfer trains a small student network under the guidance of a stronger teacher network.
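
For orientation, the sketch below shows the classic soft-target formulation of this idea (Hinton-style distillation), not Factor Transfer's factor-matching loss specifically; the temperature T and mixing weight alpha are conventional hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic soft-target distillation: the student matches the teacher's
    softened output distribution in addition to the true labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                          # T^2 keeps gradient magnitudes stable
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# toy usage: the teacher is frozen, the student learns from both targets
teacher_logits = torch.randn(8, 10)
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```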

Dynamic Channel Pruning: Feature Boosting and Suppression

deep-fry/mayo ICLR 2019

Making deep convolutional neural networks more accurate typically comes at the cost of increased computational and memory resources.

GASL: Guided Attention for Sparsity Learning in Deep Neural Networks

astorfi/attention-guided-sparsity 7 Jan 2019

The main goal of network pruning is to impose sparsity on a neural network by increasing the number of zero-valued parameters, thereby reducing the architecture's size and computational cost.
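
As a concrete baseline, the sketch below imposes sparsity by zeroing the smallest-magnitude weights in one shot; GASL itself learns where the zeros go during training via a guided-attention regularizer, which this simple version does not capture.

```python
import torch
import torch.nn as nn

def magnitude_prune(module: nn.Module, sparsity: float = 0.9) -> None:
    """Zero out the smallest-magnitude weights of each weight matrix in place.
    The end state is what pruning targets: a tensor dominated by zeros."""
    for param in module.parameters():
        if param.dim() < 2:              # skip biases / norm parameters
            continue
        k = int(sparsity * param.numel())
        threshold = param.abs().flatten().kthvalue(k).values
        mask = param.abs() > threshold
        param.data.mul_(mask)            # zeroed entries can be stored sparsely

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))
magnitude_prune(model, sparsity=0.9)
zeros = sum((p == 0).sum().item() for p in model.parameters())
total = sum(p.numel() for p in model.parameters())
print(f"sparsity: {zeros / total:.2%}")
```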

Model Compression with Adversarial Robustness: A Unified Optimization Framework

shupenggui/ATMC NeurIPS 2019

Deep model compression has been extensively studied, and state-of-the-art methods can now achieve high compression ratios with minimal accuracy loss.

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

yeshaokai/Robustness-Aware-Pruning-ADMM 23 Mar 2019

A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results.
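
A minimal sketch of one ADMM round for a single weight tensor, assuming a hypothetical loss_fn that stands in for the network's training loss: SGD approximately minimizes the loss plus a quadratic penalty, the auxiliary variable Z is projected exactly onto the sparsity constraint, and the dual variable U accumulates the residual.

```python
import torch

def project_sparse(W: torch.Tensor, keep: int) -> torch.Tensor:
    """Euclidean projection onto {W : at most `keep` nonzeros}: keep top magnitudes."""
    Z = torch.zeros_like(W)
    idx = W.abs().flatten().topk(keep).indices
    Z.view(-1)[idx] = W.view(-1)[idx]
    return Z

def admm_round(W, Z, U, loss_fn, keep, rho=1e-3, lr=1e-2, steps=10):
    W = W.detach().requires_grad_(True)
    opt = torch.optim.SGD([W], lr=lr)
    for _ in range(steps):                      # approximate W-minimization
        opt.zero_grad()
        loss = loss_fn(W) + (rho / 2) * (W - Z + U).pow(2).sum()
        loss.backward()
        opt.step()
    Z = project_sparse(W.detach() + U, keep)    # exact Z-minimization
    U = U + W.detach() - Z                      # dual ascent
    return W.detach(), Z, U

W = torch.randn(64, 64)
Z, U = project_sparse(W, keep=400), torch.zeros_like(W)
loss_fn = lambda W: (W.sum() - 1.0).pow(2)      # toy stand-in for the training loss
for _ in range(20):
    W, Z, U = admm_round(W, Z, U, loss_fn, keep=400)
# after convergence, hard-prune W to Z's support and fine-tune
```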

Light Multi-segment Activation for Model Compression

LMA-NeurIPS19/LMA 16 Jul 2019

Inspired by the expressiveness of activation functions in neural networks, we propose a multi-segment activation, which significantly improves the expressiveness of the compact student model at very little cost.
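
One way to realize such an activation, sketched under the assumption of a piecewise-linear parameterization (the paper's exact form may differ): a learnable bias plus a sum of shifted, scaled ReLU hinges, costing only 2*segments + 1 extra parameters.

```python
import torch
import torch.nn as nn

class MultiSegmentActivation(nn.Module):
    """Learnable piecewise-linear activation: more segments -> more expressive,
    at a negligible parameter cost per layer."""

    def __init__(self, segments: int = 4):
        super().__init__()
        self.slopes = nn.Parameter(torch.ones(segments) / segments)
        self.breaks = nn.Parameter(torch.linspace(-1.0, 1.0, segments))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # add a hinge dimension, then sum the hinges over the segments
        hinges = torch.relu(x.unsqueeze(-1) - self.breaks)   # (..., segments)
        return self.bias + (hinges * self.slopes).sum(-1)

act = MultiSegmentActivation(segments=4)
x = torch.randn(8, 32)
print(act(x).shape)   # torch.Size([8, 32])
```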

Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems

yoshitomo-matsubara/head-network-distillation 1 Oct 2019

Offloading the execution of complex Deep Neural Network (DNN) models to compute-capable devices at the network edge, that is, edge servers, can significantly reduce capture-to-output delay.
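
A toy illustration of the split-execution setup, with layer sizes that are purely illustrative: a "head" runs on the device and emits a small tensor, and a "tail" runs on the edge server; head-network distillation trains a compact head with a bottleneck so that the transmitted tensor stays smaller than the raw input.

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 8, 3, stride=2, padding=1), nn.ReLU(),   # bottleneck: 8ch @ 56x56
    nn.Conv2d(8, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
)
head, tail = backbone[:4], backbone[4:]      # split right after the bottleneck

x = torch.randn(1, 3, 224, 224)
sent = head(x)                               # this tensor crosses the network
kib = lambda t: t.numel() * t.element_size() / 1024
print(f"transmitted {kib(sent):.0f} KiB vs {kib(x):.0f} KiB for the raw image")
logits = tail(sent)                          # runs on the edge server
```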

How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?

arm-software/ml-restructurable-activation-networks CVPR 2021

In this paper, we reveal that the topology of concatenation-type skip connections is closely related to gradient propagation, which, in turn, enables predictable behavior of DNNs' test performance.
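
For reference, the sketch below shows the concatenation-type skip connection pattern in question (a generic DenseNet-style block, not the paper's restructured networks): every layer consumes the channel-wise concatenation of all earlier feature maps, giving gradients many short paths back to early layers.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """DenseNet-style block: layer i sees the concatenation of the input and
    the outputs of all layers before it."""

    def __init__(self, in_ch: int, growth: int, layers: int):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch + i * growth, growth, 3, padding=1)
            for i in range(layers)
        )

    def forward(self, x):
        features = [x]
        for conv in self.convs:
            out = torch.relu(conv(torch.cat(features, dim=1)))
            features.append(out)        # every later layer is wired to this output
        return torch.cat(features, dim=1)

block = DenseBlock(in_ch=16, growth=12, layers=4)
y = block(torch.randn(1, 16, 32, 32))
print(y.shape)                          # torch.Size([1, 64, 32, 32]) = 16 + 4*12
```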