Network Pruning
212 papers with code • 5 benchmarks • 5 datasets
Network Pruning is a popular approach for reducing a heavy network to a lightweight form by removing its redundancy. In this approach, a complex over-parameterized network is first trained, then pruned based on some criteria, and finally fine-tuned to achieve comparable performance with fewer parameters.
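The prune and fine-tune stages described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming weight magnitude as the pruning criterion (the description only says "some criteria") and plain gradient descent for fine-tuning; the function names `magnitude_prune` and `masked_sgd_step` are hypothetical and do not come from any of the listed papers.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest |w|.

    Assumption: magnitude is used as the pruning criterion.
    Returns the pruned weights and a boolean mask (True = weight kept).
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(round(sparsity * weights.size))  # number of weights to remove
    flat_mask = np.ones(weights.size, dtype=bool)
    if k > 0:
        # Indices of the k smallest-magnitude entries of the flattened array.
        drop = np.argpartition(np.abs(weights).ravel(), k - 1)[:k]
        flat_mask[drop] = False
    mask = flat_mask.reshape(weights.shape)
    return weights * mask, mask

def masked_sgd_step(weights, grad, mask, lr):
    """One fine-tuning step that keeps pruned weights at zero."""
    return (weights - lr * grad) * mask

# Usage: prune half of a small weight matrix, then take one fine-tuning step.
w = np.array([[0.1, -2.0], [0.05, 3.0]])
pruned, mask = magnitude_prune(w, sparsity=0.5)
updated = masked_sgd_step(pruned, grad=np.ones_like(w), mask=mask, lr=0.1)
```

Reapplying the mask after every update is one simple way to keep the sparsity pattern fixed during fine-tuning; other schemes let the mask change over time.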
Source: Ensemble Knowledge Distillation for Learning Improved and Efficient Networks
Most implemented papers
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning; however, it is less effective in the transfer learning regime that has become standard for state-of-the-art natural language processing applications.
SCOP: Scientific Control for Reliable Neural Network Pruning
To increase the reliability of the results, we prefer to have a more rigorous research design by including a scientific control group as an essential part to minimize the effect of all factors except the association between the filter and expected network output.
A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
We first formulate the weight pruning problem of DNNs as a nonconvex optimization problem with combinatorial constraints specifying the sparsity requirements, and then adopt the ADMM framework for systematic weight pruning.
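To make the ADMM formulation concrete, here is a toy sketch under strong assumptions: the network loss is replaced by a simple quadratic, 0.5 * ||w - a||^2, so the weight update has a closed form, and the sparsity constraint is "at most k nonzero entries". The names `project_to_sparsity` and `admm_prune_toy`, and the parameters `rho` and `n_iter`, are hypothetical. In a real DNN setting the weight update would instead be a few steps of gradient-based training on the loss plus the quadratic penalty.

```python
import numpy as np

def project_to_sparsity(x, k):
    """Euclidean projection of a 1-D array onto {z : at most k nonzeros}.

    The projection keeps the k largest-magnitude entries and zeroes the rest.
    """
    z = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        z[keep] = x[keep]
    return z

def admm_prune_toy(a, k, rho=1.0, n_iter=20):
    """ADMM for: minimize 0.5 * ||w - a||^2  subject to  w having <= k nonzeros.

    w: unconstrained copy of the weights
    z: auxiliary variable constrained to be sparse
    u: scaled dual variable tying w and z together
    """
    w = a.copy()
    z = project_to_sparsity(w, k)
    u = np.zeros_like(a)
    for _ in range(n_iter):
        # w-step: minimize 0.5*||w - a||^2 + (rho/2)*||w - z + u||^2 (closed form).
        w = (a + rho * (z - u)) / (1.0 + rho)
        # z-step: project onto the sparsity constraint set.
        z = project_to_sparsity(w + u, k)
        # Dual update accumulates the remaining disagreement between w and z.
        u = u + w - z
    return w, z

a = np.array([3.0, -0.5, 2.0, 0.1])
w, z = admm_prune_toy(a, k=2)
```

The split into `w` and `z` is the point of the method: the loss term never sees the combinatorial constraint directly, and the constraint is handled by a cheap projection.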
Importance Estimation for Neural Network Pruning
On ResNet-101, we achieve a 40% FLOPs reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet.
Picking Winning Tickets Before Training by Preserving Gradient Flow
Overparameterization has been shown to benefit both the optimization and generalization of neural networks, but large networks are resource hungry at both training and test time.
Similarity of Neural Networks with Gradients
A suitable similarity index for comparing learnt neural networks plays an important role in understanding the behaviour of the highly-nonlinear functions, and can provide insights on further theoretical analysis and empirical studies.
A Simple and Effective Pruning Approach for Large Language Models
Motivated by the recent observation of emergent large magnitude features in LLMs, our approach prunes weights with the smallest magnitudes multiplied by the corresponding input activations, on a per-output basis.
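The scoring rule in this description can be sketched as follows. This is an illustrative approximation, not the authors' implementation: it assumes a linear layer with a weight matrix of shape (out_features, in_features), summarizes each input feature's activations by their L2 norm over a small calibration batch, and removes the same fraction of weights from every output row. The function name `prune_by_weight_times_activation` is hypothetical.

```python
import numpy as np

def prune_by_weight_times_activation(weight, activations, sparsity):
    """Prune each output row by the score |W_ij| * ||X_j||.

    weight:      (out_features, in_features)
    activations: (n_samples, in_features) calibration inputs to the layer
    sparsity:    fraction of weights removed from every output row
    """
    # Assumption: one activation scale per input feature (L2 norm over samples).
    act_norm = np.linalg.norm(activations, axis=0)   # (in_features,)
    score = np.abs(weight) * act_norm                # broadcasts across rows
    k = int(sparsity * weight.shape[1])              # weights removed per row
    mask = np.ones(weight.shape, dtype=bool)
    if k > 0:
        # Per-output comparison: rank scores within each row independently.
        drop = np.argsort(score, axis=1)[:, :k]
        np.put_along_axis(mask, drop, False, axis=1)
    return weight * mask, mask

weight = np.array([[1.0, 0.5, 2.0, 0.1],
                   [0.2, 3.0, 0.3, 1.0]])
activations = np.array([[0.1, 10.0, 1.0, 30.0]])
pruned, mask = prune_by_weight_times_activation(weight, activations, sparsity=0.5)
```

In the first row, pure magnitude pruning would drop the weights 0.1 and 0.5, but both sit on input features with large activations, so this score keeps them and removes 1.0 and 2.0 instead.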
Building Efficient ConvNets using Redundant Feature Pruning
This paper presents an efficient technique to prune deep and/or wide convolutional neural network models by eliminating redundant features (or filters).
A Closer Look at Structured Pruning for Neural Network Compression
Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network.
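A single channel-removal step of this kind might look like the sketch below. It assumes two consecutive convolution layers with weights laid out as (out_channels, in_channels, kernel_h, kernel_w) and ranks filters by their L1 norm, which is one common criterion and is not stated in the description above. The function name `prune_channels` is hypothetical, and the fine-tuning that would follow each removal step is omitted.

```python
import numpy as np

def prune_channels(conv_w, next_conv_w, n_remove):
    """Remove the n_remove output channels of conv_w with the smallest L1 norm.

    conv_w:      (out_channels, in_channels, kh, kw)
    next_conv_w: (next_out, out_channels, kh, kw), the layer consuming conv_w's output
    Returns both weight tensors with the channels physically removed,
    plus the indices of the channels that were kept.
    """
    # Assumption: filter importance is measured by the L1 norm of its weights.
    norms = np.abs(conv_w).sum(axis=(1, 2, 3))
    keep = np.sort(np.argsort(norms)[n_remove:])
    # Dropping an output channel also drops the matching input channel downstream.
    return conv_w[keep], next_conv_w[:, keep], keep

# Four filters with clearly different scales; the two weakest are removed.
scales = np.array([3.0, 0.1, 2.0, 0.2]).reshape(4, 1, 1, 1)
conv_w = np.ones((4, 3, 3, 3)) * scales
next_conv_w = np.ones((8, 4, 3, 3))
small_w, small_next_w, keep = prune_channels(conv_w, next_conv_w, n_remove=2)
```

Unlike the unstructured masks used for individual weights, the tensors here actually shrink, so the narrower network runs faster on ordinary dense hardware.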
Rethinking the Value of Network Pruning
Our observations are consistent for multiple network architectures, datasets, and tasks, which imply that: 1) training a large, over-parameterized model is often not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are typically not useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is more crucial to the efficiency in the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.