Network Pruning
232 papers with code • 5 benchmarks • 5 datasets
Network Pruning is a popular approach to reduce a heavy network to obtain a light-weight form by removing redundancy in the heavy network. In this approach, a complex over-parameterized network is first trained, then pruned based on come criterions, and finally fine-tuned to achieve comparable performance with reduced parameters.
Source: Ensemble Knowledge Distillation for Learning Improved and Efficient Networks
Libraries
Use these libraries to find Network Pruning models and implementationsMost implemented papers
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
(2) Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car.
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Based on these results, we articulate the "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations.
Pruning Filters for Efficient ConvNets
However, magnitude-based pruning of weights reduces a significant number of parameters from the fully connected layers and may not adequately reduce the computation costs in the convolutional layers due to irregular sparsity in the pruned networks.
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
To address this limitation, we introduce "deep compression", a three stage pipeline: pruning, trained quantization and Huffman coding, that work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
SNIP: Single-shot Network Pruning based on Connection Sensitivity
To achieve this, we introduce a saliency criterion based on connection sensitivity that identifies structurally important connections in the network for the given task.
Manifold Regularized Dynamic Network Pruning
Then, the manifold relationship between instances and the pruned sub-networks will be aligned in the training procedure.
A Simple and Effective Pruning Approach for Large Language Models
Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting.
A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
We first formulate the weight pruning problem of DNNs as a nonconvex optimization problem with combinatorial constraints specifying the sparsity requirements, and then adopt the ADMM framework for systematic weight pruning.
Network Pruning via Transformable Architecture Search
The maximum probability for the size in each distribution serves as the width and depth of the pruned network, whose parameters are learned by knowledge transfer, e. g., knowledge distillation, from the original networks.