Model Compression

171 papers with code • 0 benchmarks • 1 dataset

Model compression has been an active area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
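As a rough illustration of two of the methods named in the description above, the sketch below applies magnitude pruning and symmetric uniform quantization to a flat list of weights. The function names, the 50% sparsity level, and the 8-bit width are assumptions chosen for the example, not taken from any of the listed papers; low-rank factorization is not shown.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_uniform(weights, num_bits):
    """Symmetric uniform quantization: returns integer codes and the scale."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


weights = [0.9, -0.05, 0.4, -0.7, 0.02, 0.1]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_uniform(pruned, num_bits=8)
restored = dequantize(codes, scale)
```

Here the pruned weights can be stored as small integers plus one scale factor, and each restored value differs from its pruned original by at most half a quantization step.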

Greatest papers with code

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

google-research/bert ICLR 2020

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training.

Fine-tuning • Knowledge Distillation • +3 more

What Do Compressed Deep Neural Networks Forget?

google-research/google-research 13 Nov 2019

Top-line test-set accuracy conceals significant differences in how different classes and images are impacted by model compression techniques.

Fairness • Interpretability Techniques for Deep Learning • +4 more

The State of Sparsity in Deep Neural Networks

google-research/google-research 25 Feb 2019

We rigorously evaluate three state-of-the-art techniques for inducing sparsity in deep neural networks on two large-scale learning tasks: Transformer trained on WMT 2014 English-to-German, and ResNet-50 trained on ImageNet.

Model Compression • Sparse Learning

Training with Quantization Noise for Extreme Model Compression

pytorch/fairseq ICLR 2021

A standard solution is to train networks with Quantization Aware Training, where the weights are quantized during training and the gradients approximated with the Straight-Through Estimator.

Image Generation • Model Compression
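The Quantization Aware Training setup described in the entry above can be sketched on a one-weight toy problem: the forward pass always uses the quantized weight, and the gradient with respect to that quantized weight is applied unchanged to a latent full-precision weight (the Straight-Through Estimator). The grid step, learning rate, and toy data are assumptions for illustration; this is not the fairseq implementation.

```python
def quantize(w, step=0.25):
    """Round a weight to the nearest multiple of `step` (a uniform grid)."""
    return round(w / step) * step


def train_qat(xs, ys, w=0.0, lr=0.05, epochs=200, step=0.25):
    """Fit y ~ w * x while always running the forward pass with quantized w."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w_q = quantize(w, step)   # forward pass uses the quantized weight
            err = w_q * x - y         # prediction error
            grad_wq = 2.0 * err * x   # d(err^2) / d(w_q)
            # Straight-Through Estimator: rounding has zero gradient almost
            # everywhere, so we pretend d(w_q)/d(w) = 1 and update the
            # latent full-precision weight with the gradient w.r.t. w_q.
            w -= lr * grad_wq
    return w, quantize(w, step)


xs = [1.0, 2.0, 3.0]
ys = [0.7 * x for x in xs]  # target slope 0.7 lies between grid points
latent_w, deployed_w = train_qat(xs, ys)
```

The deployed weight is always a grid point (here 0.5 or 0.75, the two neighbors of the target slope), while the latent weight stays in full precision and only exists during training.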

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

pytorch/vision 24 Feb 2016

Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car.

Image Classification • Model Compression

Model compression via distillation and quantization

NervanaSystems/distiller ICLR 2018

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning.

Model Compression • Quantization

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

NervanaSystems/distiller ECCV 2018

Model compression is a critical technique to efficiently deploy neural network models on mobile devices which have limited computation resources and tight power budgets.

Model Compression • Neural Architecture Search

BinaryBERT: Pushing the Limit of BERT Quantization

huawei-noah/Pretrained-Language-Model ACL 2021

In this paper, we propose BinaryBERT, which pushes BERT quantization to the limit by weight binarization.

Binarization • Fine-tuning • +2 more
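Weight binarization, as mentioned in the entry above, stores one bit per weight plus a shared scale. The sketch below uses a common scheme in which each weight becomes +alpha or -alpha, with alpha set to the mean absolute weight; this choice of alpha and the helper names are assumptions for illustration and are not confirmed by the snippet as the paper's exact procedure.

```python
def binarize(weights):
    """Map each weight to a sign bit, with a shared scale alpha = mean(|w|)."""
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]  # one bit per weight
    return signs, alpha


def reconstruct(signs, alpha):
    """Recover the binarized weights: each one is +alpha or -alpha."""
    return [s * alpha for s in signs]


weights = [0.4, -0.2, 0.1, -0.5]
signs, alpha = binarize(weights)
approx = reconstruct(signs, alpha)
```

For these four weights the signs are `[1, -1, 1, -1]` and alpha is about 0.3, so every reconstructed weight has magnitude roughly 0.3.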

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

FLHonker/Awesome-Knowledge-Distillation 13 Apr 2020

To achieve faster speeds and to handle the problems caused by the lack of data, knowledge distillation (KD) has been proposed to transfer information learned from one model to another.

Knowledge Distillation • Model Compression • +1 more
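One widely used way to transfer information from a teacher model to a student, as described in the entry above, is to match temperature-softened output distributions. The sketch below computes such a soft-target loss (KL divergence scaled by the squared temperature); the temperature value and the example logits are assumptions, and the review itself covers many other distillation variants.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]
student_near = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
student_far = [0.5, 1.0, 4.0]    # ranks the classes in the opposite order

loss_near = distillation_loss(student_near, teacher)
loss_far = distillation_loss(student_far, teacher)
```

A student whose logits track the teacher's gets a small loss, and the loss is zero only when the two softened distributions coincide. In practice this term is usually combined with the ordinary cross-entropy on ground-truth labels.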