Neural Network Compression
74 papers with code • 1 benchmark • 1 dataset
Libraries
Use these libraries to find Neural Network Compression models and implementations

Most implemented papers
spred: Solving $L_1$ Penalty with SGD
We propose to minimize a generic differentiable objective with $L_1$ constraint using a simple reparametrization and straightforward stochastic gradient descent.
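The reparametrization can be sketched on a toy problem: writing w = u ⊙ v turns the non-smooth objective f(w) + λ‖w‖₁ into the smooth f(u ⊙ v) + (λ/2)(‖u‖² + ‖v‖²), which plain (stochastic) gradient descent handles. The lasso problem, data sizes, and hyperparameters below are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy lasso problem: recover a sparse vector from noisy linear measurements.
n, d = 200, 20
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)

lam, lr = 0.1, 0.01
u = rng.normal(scale=0.5, size=d)   # reparametrize w = u * v (elementwise)
v = rng.normal(scale=0.5, size=d)

for _ in range(5000):
    w = u * v
    grad_w = X.T @ (X @ w - y) / n  # gradient of the smooth data-fit loss
    # The L2 penalty on (u, v) induces an L1 penalty on w = u * v.
    u, v = u - lr * (grad_w * v + lam * u), v - lr * (grad_w * u + lam * v)

w_hat = u * v
```

The weight decay on u and v drives coordinates of w_hat whose true value is zero to (numerically) zero, so the solution is sparse even though every update step is an ordinary smooth gradient step.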
StrassenNets: Deep Learning with a Multiplication Budget
A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers.
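As background for the "multiplication budget" idea (this is the classic fixed algorithm, not the paper's learned scheme), Strassen's recurrence multiplies two 2x2 matrices with 7 scalar multiplications instead of the naive 8:

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications
    instead of the naive 8 (Strassen, 1969)."""
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    e, f, g, h = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

rng = np.random.default_rng(0)
A, B = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
C = strassen_2x2(A, B)
```

Applied recursively, this trades multiplications for additions, which is the kind of budget the paper makes trainable.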
Deep Neural Network Compression with Single and Multiple Level Quantization
In this paper, we propose two novel network quantization approaches: single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit (ternary) quantization. We are the first to consider network quantization at both the width and depth levels.
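For reference on what "extremely low-bit (ternary)" means, here is a minimal TWN-style ternarizer (threshold-and-scale). SLQ/MLQ use their own clustering-based procedure, so this is generic background, not the paper's method; the example weights and the 0.7 threshold factor are illustrative:

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Quantize weights to {-alpha, 0, +alpha} (ternary), TWN-style:
    small weights are zeroed, the rest share one learned-free scale."""
    delta = delta_factor * np.mean(np.abs(w))   # magnitude threshold
    mask = np.abs(w) > delta
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.array([0.05, -0.02, 1.0, -0.9, 0.8, 0.01])
q = ternarize(w)   # small entries -> 0, large entries -> +/- alpha
```

Each weight then needs only about 1.6 bits (log2 of 3 states) plus one scale per tensor, which is where the extreme compression comes from.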
Characterising Across-Stack Optimisations for Deep Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are extremely computationally demanding, presenting a large barrier to their deployment on resource-constrained devices.
Differentiable Fine-grained Quantization for Deep Neural Network Compression
Judiciously selecting different precisions for different layers/structures can produce more efficient models than traditional quantization methods by striking a better balance between accuracy and compression rate.
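The knob being selected can be sketched as plain uniform symmetric quantization at a chosen bit width; more bits means smaller quantization error but less compression, which is exactly the per-layer trade-off. This numpy sketch is illustrative and is not the paper's differentiable selection mechanism:

```python
import numpy as np

def uniform_quantize(w, bits):
    """Uniform symmetric quantization of a weight tensor to `bits` bits."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 representable steps at 8 bits
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale) * scale    # snap each weight to its nearest level

rng = np.random.default_rng(0)
w = rng.normal(size=1000)
# Maximum reconstruction error at each candidate bit width.
errors = {b: float(np.abs(w - uniform_quantize(w, b)).max()) for b in (2, 4, 8)}
```

A mixed-precision method effectively searches over assignments of `bits` per layer so that sensitive layers get more levels and robust layers get fewer.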
Efficient Neural Network Compression
The favorable trade-off between accuracy and complexity, together with its extremely fast speed, makes our method well suited for neural network compression.
Few Sample Knowledge Distillation for Efficient Network Compression
Deep neural network compression techniques such as pruning and weight tensor decomposition usually require fine-tuning to recover the prediction accuracy when the compression ratio is high.
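The recovery step is often a distillation-style objective. As generic background (Hinton-style temperature-softened KL divergence; the few-sample method itself aligns intermediate features rather than logits), a minimal distillation loss looks like this:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 as in Hinton et al.'s formulation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * (p * (np.log(p) - np.log(q))).sum(axis=-1).mean()

teacher = np.array([[2.0, 0.5, -1.0]])
student = np.array([[0.1, 0.2, 0.3]])
loss = distillation_loss(student, teacher)
```

The temperature softens the teacher's distribution so the student also learns the relative probabilities of wrong classes, which matters when only few samples are available.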
DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression
In this paper, we propose DeepSZ: an accuracy-loss bounded neural network compression framework, which involves four key steps: network pruning, error bound assessment, optimization for error bound configuration, and compressed model generation, featuring a high compression ratio and low encoding time.
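Two of those steps can be sketched in a few lines: magnitude-based pruning, and an absolute-error-bounded quantizer in the spirit of SZ's linear-scaling quantization. This is an illustrative sketch under assumed parameters (sparsity 0.9, error bound 0.05), not the DeepSZ implementation itself:

```python
import numpy as np

def prune_by_magnitude(w, sparsity=0.9):
    """Network pruning step: zero out the smallest-magnitude weights."""
    k = int(sparsity * w.size)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def error_bounded_quantize(values, error_bound):
    """Linear-scaling quantization in the spirit of SZ: every
    reconstructed value is within `error_bound` of the original."""
    step = 2.0 * error_bound
    return np.round(values / step) * step

rng = np.random.default_rng(0)
w = rng.normal(size=256)
pruned = prune_by_magnitude(w, sparsity=0.9)
survivors = pruned[pruned != 0.0]                    # only these need storing
recon = error_bounded_quantize(survivors, error_bound=0.05)
```

The error-bound assessment and configuration steps in the paper then choose `error_bound` per layer so the end-to-end accuracy loss stays within a target.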
Focused Quantization for Sparse CNNs
On ResNet-50, we achieved an 18.08x compression ratio (CR) with only a 0.24% loss in top-5 accuracy, outperforming existing compression methods.
COP: Customized Deep Model Compression via Regularized Correlation-Based Filter-Level Pruning
2) Cross-layer filter comparison is unachievable because filter importance is defined locally within each layer.
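The underlying correlation-based redundancy idea can be illustrated as follows; COP additionally regularizes the criterion and normalizes importance so filters are comparable across layers, and the 0.95 threshold here is an assumed value:

```python
import numpy as np

def redundant_filters(filters, corr_threshold=0.95):
    """Return indices of filters highly correlated with an earlier, kept
    filter.  Minimal illustration of correlation-based filter pruning."""
    flat = filters.reshape(filters.shape[0], -1)
    corr = np.corrcoef(flat)                  # pairwise Pearson correlations
    pruned = []
    for i in range(len(flat)):
        for j in range(i):
            if j not in pruned and abs(corr[i, j]) > corr_threshold:
                pruned.append(i)              # filter i duplicates filter j
                break
    return pruned

rng = np.random.default_rng(0)
base = rng.normal(size=(3, 3, 3))
other = rng.normal(size=(3, 3, 3))
stack = np.stack([base, other, 2.0 * base])   # filter 2 rescales filter 0
to_prune = redundant_filters(stack)
```

A rescaled copy of an existing filter has correlation 1 with it, so it adds no new feature and is flagged for removal.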