Network Pruning
213 papers with code • 5 benchmarks • 5 datasets
Network Pruning is a popular approach to reducing a heavy network to a lightweight form by removing redundancy. In this approach, a complex over-parameterized network is first trained, then pruned based on some criteria, and finally fine-tuned to achieve comparable performance with far fewer parameters.
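The simplest instance of this pipeline is unstructured magnitude pruning: after training, zero out the weights with the smallest absolute value, then fine-tune the survivors. A minimal NumPy sketch of the pruning step (the function name and the use of a single weight matrix are illustrative):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    flat = np.abs(weights).flatten()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))          # stand-in for a trained layer's weights
pruned = magnitude_prune(w, 0.9)
sparsity = 1.0 - np.count_nonzero(pruned) / pruned.size  # close to 0.9
```

In practice the zeroed positions are recorded as a binary mask so that fine-tuning updates only the surviving weights.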
Source: Ensemble Knowledge Distillation for Learning Improved and Efficient Networks
Libraries
Use these libraries to find Network Pruning models and implementations
Latest papers
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
On the other hand, even successful localization methods identify neurons that are not specific to a single memorized sequence.
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
GBLM-Pruner leverages the first-order term of the Taylor expansion and operates in a training-free manner: it harnesses properly normalized gradients from a few calibration samples to determine the pruning metric, and substantially outperforms competitive counterparts such as SparseGPT and Wanda on multiple benchmarks.
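A gradient-informed metric of this kind can be sketched as scoring each weight by its magnitude times an aggregate gradient statistic over calibration samples. The exact normalization in GBLM-Pruner differs; the l2 aggregation below is a simplifying assumption:

```python
import numpy as np

def gradient_pruning_scores(weights, grads):
    """Score each weight by |w| times the l2 norm of its per-sample
    gradients (a simplified, assumed form of a gradient-based metric)."""
    grad_norm = np.sqrt((np.stack(grads) ** 2).sum(axis=0))
    return np.abs(weights) * grad_norm

rng = np.random.default_rng(1)
w = rng.normal(size=(8, 8))
grads = [rng.normal(size=(8, 8)) for _ in range(4)]  # 4 calibration samples
scores = gradient_pruning_scores(w, grads)
mask = scores >= np.median(scores)  # keep the top half by score
```

Because the metric needs only a handful of calibration gradients, no retraining or weight update is required to produce the mask.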
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Inspired by Dynamic Sparse Training, DSnoT minimizes the reconstruction error between the dense and sparse LLMs by performing iterative weight pruning-and-growing on top of the sparse LLM.
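One pruning-and-growing step can be sketched as a greedy swap: revive the pruned weight whose absence contributes most to the reconstruction error and drop the kept weight contributing least. The per-weight contribution estimate below (|w| scaled by the input feature norm) is a simplifying assumption, not DSnoT's exact criterion:

```python
import numpy as np

def prune_and_grow_step(w_dense, mask, x):
    """One greedy pruning-and-growing swap that preserves overall sparsity."""
    # Approximate each weight's contribution to || x @ w_dense - x @ (w_dense * mask) ||
    contrib = np.abs(w_dense) * np.linalg.norm(x, axis=0)[:, None]
    pruned_cand = contrib * (1 - mask)            # zeroed weights: candidates to grow
    kept_cand = np.where(mask == 1, contrib, np.inf)  # kept weights: candidates to prune
    grow = np.unravel_index(np.argmax(pruned_cand), mask.shape)
    cut = np.unravel_index(np.argmin(kept_cand), mask.shape)
    mask = mask.copy()
    mask[grow], mask[cut] = 1, 0
    return mask

rng = np.random.default_rng(2)
x = rng.normal(size=(16, 10))                  # calibration activations
w = rng.normal(size=(10, 5))
mask = (rng.random((10, 5)) > 0.5).astype(float)
new_mask = prune_and_grow_step(w, mask, x)     # same number of nonzeros
```

Iterating such swaps updates only the mask, never the weights, which is what makes the procedure training-free.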
Filter Pruning For CNN With Enhanced Linear Representation Redundancy
In this paper, we propose a new structured pruning method.
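For context, the most common baseline for structured filter pruning ranks whole convolutional filters by their l1 norm and removes the weakest ones; the paper's redundancy-based criterion refines this idea. A minimal sketch of the baseline:

```python
import numpy as np

def prune_filters_l1(conv_weight, n_keep):
    """Keep the n_keep filters with the largest l1 norm.
    conv_weight has shape (out_channels, in_channels, kh, kw)."""
    norms = np.abs(conv_weight).reshape(conv_weight.shape[0], -1).sum(axis=1)
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])  # indices of survivors, in order
    return conv_weight[keep], keep

rng = np.random.default_rng(3)
w = rng.normal(size=(32, 16, 3, 3))   # a 32-filter conv layer
pruned_w, kept_idx = prune_filters_l1(w, 24)
```

Unlike unstructured pruning, removing entire filters shrinks the actual tensor shapes, so the speedup is realized on standard hardware without sparse kernels.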
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Large Language Models (LLMs), renowned for their remarkable performance across diverse domains, present a challenge when it comes to practical deployment due to their colossal model size.
SWAP: Sparse Entropic Wasserstein Regression for Robust Network Pruning
This study addresses the challenge of inaccurate gradients in computing the empirical Fisher Information Matrix during neural network pruning.
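The empirical Fisher referred to here is commonly estimated from per-sample gradients, and its diagonal yields a classic second-order saliency score for pruning. A hedged sketch of that standard estimate (the paper's Wasserstein-regression correction is not shown):

```python
import numpy as np

def empirical_fisher_diag(per_sample_grads):
    """Diagonal of the empirical Fisher: mean of squared per-sample gradients."""
    g = np.stack(per_sample_grads)
    return (g ** 2).mean(axis=0)

def fisher_saliency(weights, fisher_diag):
    """OBD-style saliency: 0.5 * F_ii * w_i^2 estimates the loss increase
    from zeroing weight i (diagonal approximation)."""
    return 0.5 * fisher_diag * weights ** 2

rng = np.random.default_rng(4)
w = rng.normal(size=(20,))
grads = [rng.normal(size=(20,)) for _ in range(8)]  # 8 per-sample gradients
fisher = empirical_fisher_diag(grads)
saliency = fisher_saliency(w, fisher)   # low saliency = safe to prune
```

Noisy gradients make this diagonal estimate unreliable, which is precisely the inaccuracy the study targets.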
Feather: An Elegant Solution to Effective DNN Sparsification
Neural network pruning is an increasingly popular way of producing compact and efficient models, suitable for resource-limited environments, while preserving high performance.
EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection
Various researchers have applied machine learning methods in an attempt to detect COVID-19.
A Survey on Deep Neural Network Pruning-Taxonomy, Comparison, Analysis, and Recommendations
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources.
Distilled Pruning: Using Synthetic Data to Win the Lottery
This work introduces a novel approach to pruning deep learning models by using distilled data.