Model Compression
342 papers with code • 2 benchmarks • 4 datasets
Model compression has been an active area of research in recent years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.
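Two of the techniques named above, magnitude-based parameter pruning and uniform weight quantization, can be illustrated on a single weight matrix. This is a minimal numpy sketch of the general ideas, not any specific paper's method; the 90% sparsity target and 8-bit width are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)

# Parameter pruning: zero out the ~90% of weights with smallest magnitude.
threshold = np.quantile(np.abs(weights), 0.9)
pruned = np.where(np.abs(weights) >= threshold, weights, 0.0)

# Weight quantization: map the surviving weights to 8-bit integers
# with a single symmetric scale factor.
scale = np.abs(pruned).max() / 127.0
quantized = np.round(pruned / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

sparsity = (pruned == 0).mean()
print(f"sparsity: {sparsity:.2f}")  # roughly 0.90
```

In practice the sparse int8 tensor (indices plus values) is what gets stored or shipped to the device; the dequantized weights are used at inference time.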
Libraries
Use these libraries to find Model Compression models and implementations

Most implemented papers
Data-Free Knowledge Distillation for Deep Neural Networks
Recent advances in model compression have provided procedures for compressing large neural networks to a fraction of their original size while retaining most if not all of their accuracy.
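Knowledge distillation, which this paper builds on, compresses a network by training a small student to match the temperature-softened outputs of a large teacher. A minimal numpy sketch of the standard distillation loss follows; the function names and the temperature value are illustrative, and the data-free variant the paper proposes (which synthesizes its own training inputs) is not shown:

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between softened teacher and student distributions,
    scaled by T^2 as is conventional so gradients stay comparable across T."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = (p_t * (np.log(p_t) - np.log(p_s))).sum(axis=-1).mean()
    return float(kl * T * T)

teacher = np.array([[5.0, 1.0, -2.0]])
print(distillation_loss(teacher, teacher))  # 0.0 when student matches teacher
```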
Weightless: Lossy Weight Encoding For Deep Neural Network Compression
This results in up to a 1.51x improvement over the state-of-the-art.
Paraphrasing Complex Network: Network Compression via Factor Transfer
Among model compression methods, knowledge transfer trains a student network under the guidance of a stronger teacher network.
Dynamic Channel Pruning: Feature Boosting and Suppression
Making deep convolutional neural networks more accurate typically comes at the cost of increased computational and memory resources.
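The core idea of feature boosting and suppression is to predict a per-channel saliency at runtime and compute only the most salient channels, amplifying those and zeroing the rest. This is a simplified numpy sketch of that gating step, not the paper's exact formulation (in the paper the saliency predictor is a small learned subnetwork):

```python
import numpy as np

def gate_channels(features, saliency, k):
    """Keep the k most salient channels, suppress (zero) the rest.

    features: (C, H, W) activation tensor; saliency: (C,) predicted scores.
    Kept channels are scaled (boosted) by their saliency.
    """
    keep = np.argsort(saliency)[-k:]      # indices of the top-k channels
    mask = np.zeros_like(saliency)
    mask[keep] = saliency[keep]           # boost kept channels, zero the rest
    return features * mask[:, None, None]

feats = np.ones((8, 4, 4))
sal = np.arange(8, dtype=float)
out = gate_channels(feats, sal, k=2)
print(int((out.sum(axis=(1, 2)) > 0).sum()))  # 2 channels remain active
```

Because suppressed channels are zero, the convolutions that would produce them can be skipped entirely, which is where the compute savings come from.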
GASL: Guided Attention for Sparsity Learning in Deep Neural Networks
The main goal of network pruning is to impose sparsity on a neural network by increasing the number of zero-valued parameters, reducing the architecture's size and yielding computational speedup.
Model Compression with Adversarial Robustness: A Unified Optimization Framework
Deep model compression has been extensively studied, and state-of-the-art methods can now achieve high compression ratios with minimal accuracy loss.
Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM
A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results.
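ADMM-based pruning alternates between a loss-minimization step on the weights and a Euclidean projection onto the sparsity constraint set, coupled by a dual variable. This toy numpy sketch applies the same alternation to a least-squares problem rather than a neural network; the sparsity level k, penalty rho, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(30, 10))
b = rng.normal(size=30)
k, rho = 3, 1.0  # keep at most k nonzero weights

def project_topk(v, k):
    """Euclidean projection onto {vectors with at most k nonzeros}:
    keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

W = np.zeros(10)  # primal weights
Z = np.zeros(10)  # sparse copy
U = np.zeros(10)  # scaled dual variable
lhs = A.T @ A + rho * np.eye(10)
for _ in range(50):
    # W-step: minimize ||A W - b||^2 + (rho/2)||W - Z + U||^2 in closed form.
    W = np.linalg.solve(lhs, A.T @ b + rho * (Z - U))
    Z = project_topk(W + U, k)  # Z-step: project onto the sparsity set
    U = U + W - Z               # dual update

print(int((Z != 0).sum()))  # at most k = 3 nonzeros survive
```

In the DNN setting the closed-form W-step is replaced by SGD on the training loss plus the same quadratic penalty, and the projection is applied per layer.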
Light Multi-segment Activation for Model Compression
Inspired by the expressive power of neural networks, we propose a multi-segment activation that significantly improves the expressiveness of the compact student model at very little cost.
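A multi-segment activation is a piecewise-linear function that generalizes ReLU: the input range is split at breakpoints, and each segment applies its own slope and intercept (which can be learned). The numpy sketch below is an illustration of this general idea, not the paper's exact formulation; all breakpoint and slope values are made up:

```python
import numpy as np

def multi_segment_activation(x, breakpoints, slopes, intercepts):
    """Piecewise-linear activation: each input element falls into one
    segment (determined by the breakpoints) and is mapped by that
    segment's slope and intercept."""
    seg = np.searchsorted(breakpoints, x)  # segment index per element
    return slopes[seg] * x + intercepts[seg]

# Two breakpoints -> three segments: flat, shallow, identity.
bps = np.array([-1.0, 0.0])
slopes = np.array([0.0, 0.1, 1.0])
intercepts = np.array([-0.1, 0.0, 0.0])

x = np.array([-2.0, -0.5, 2.0])
print(multi_segment_activation(x, bps, slopes, intercepts))
```

With slopes `[0, 0, 1]` and zero intercepts this reduces to ReLU; extra segments add expressiveness while costing only a handful of scalar parameters per activation.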
Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems
Offloading the execution of complex Deep Neural Network (DNN) models to compute-capable devices at the network edge, that is, edge servers, can significantly reduce capture-to-output delay.
How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?
In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance.