Neural Network Compression
74 papers with code • 1 benchmark • 1 dataset
Libraries
Use these libraries to find Neural Network Compression models and implementations.

Latest papers
PD-Quant: Post-Training Quantization based on Prediction Difference Metric
It determines the quantization parameters using the difference between the network's predictions before and after quantization.
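A minimal sketch of this idea for a single linear layer, assuming an MSE prediction-difference metric and a grid search over candidate scales (both illustrative choices, not necessarily the paper's):

```python
import torch

def fake_quant(w: torch.Tensor, scale: float, n_bits: int = 8) -> torch.Tensor:
    qmax = 2 ** (n_bits - 1) - 1
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

def search_scale(layer: torch.nn.Linear, x: torch.Tensor, n_bits: int = 8) -> float:
    """Pick the scale whose quantized-layer output is closest to the FP32 one."""
    with torch.no_grad():
        ref = layer(x)                               # full-precision prediction
        w = layer.weight
        base = w.abs().max().item() / (2 ** (n_bits - 1) - 1)
        best_scale, best_err = base, float("inf")
        for mult in torch.linspace(0.5, 1.2, 15):    # candidate scales (illustrative)
            scale = base * mult.item()
            q_out = torch.nn.functional.linear(x, fake_quant(w, scale, n_bits), layer.bias)
            err = (q_out - ref).pow(2).mean().item() # prediction difference
            if err < best_err:
                best_scale, best_err = scale, err
    return best_scale

layer = torch.nn.Linear(64, 10)
x = torch.randn(32, 64)                              # calibration batch
print(search_scale(layer, x))
```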
spred: Solving $L_1$ Penalty with SGD
We propose to minimize a generic differentiable objective with an $L_1$ constraint using a simple reparametrization and straightforward stochastic gradient descent.
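The trick rests on the elementwise identity $|w| = \min_{w = u \odot v} (u^2 + v^2)/2$, so plain SGD on a smooth weight-decayed surrogate recovers the $L_1$-penalized solution. A toy lasso sketch (variable names are mine):

```python
import torch

torch.manual_seed(0)
X, y = torch.randn(200, 50), torch.randn(200)
lam = 0.1                                        # L1 penalty strength

u = torch.randn(50, requires_grad=True)          # redundant reparametrization w = u * v
v = torch.randn(50, requires_grad=True)
opt = torch.optim.SGD([u, v], lr=1e-2)

for _ in range(2000):
    w = u * v
    # smooth surrogate: f(u*v) + lam/2 * (||u||^2 + ||v||^2) == f(w) + lam * ||w||_1 at the optimum
    loss = ((X @ w - y) ** 2).mean() + lam / 2 * (u.pow(2).sum() + v.pow(2).sum())
    opt.zero_grad(); loss.backward(); opt.step()

w = (u * v).detach()
print("nonzeros:", (w.abs() > 1e-3).sum().item())  # sparse solution, as with lasso
```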
SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks
In this paper, we perform an empirical evaluation of methods for sharing parameters in isotropic networks (SPIN).
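A hedged illustration of cross-layer parameter sharing in an isotropic (constant-width) network; the module names and residual form are my assumptions, not the paper's exact scheme:

```python
import torch

class SharedIsotropicNet(torch.nn.Module):
    def __init__(self, dim: int = 128, depth: int = 8):
        super().__init__()
        self.block = torch.nn.Sequential(        # one set of parameters...
            torch.nn.Linear(dim, dim), torch.nn.ReLU())
        self.depth = depth

    def forward(self, x):
        for _ in range(self.depth):              # ...applied at every layer
            x = x + self.block(x)                # residual form keeps deep reuse stable
        return x

net = SharedIsotropicNet()
print(sum(p.numel() for p in net.parameters()))  # ~dim^2 weights, not depth * dim^2
```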
Wavelet Feature Maps Compression for Image-to-Image CNNs
Convolutional Neural Networks (CNNs) are known for requiring extensive computational resources, and quantization is among the best and most common methods for compressing them.
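A rough sketch of the general idea, not the paper's method: transform feature maps into a wavelet basis, where most energy concentrates in the low-frequency band, and quantize the coefficients there; a single-level Haar transform and uniform 8-bit quantization stand in for the paper's choices:

```python
import torch

def haar2d(x):
    """Single-level 2D Haar transform of feature maps x: (B, C, H, W), even H and W."""
    a, b = x[..., ::2, :], x[..., 1::2, :]       # pair rows
    lo, hi = (a + b) / 2, (a - b) / 2
    ll, lh = (lo[..., ::2] + lo[..., 1::2]) / 2, (lo[..., ::2] - lo[..., 1::2]) / 2
    hl, hh = (hi[..., ::2] + hi[..., 1::2]) / 2, (hi[..., ::2] - hi[..., 1::2]) / 2
    return ll, lh, hl, hh                        # one low-pass and three detail bands

def quantize(t: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    scale = t.abs().max() / (2 ** (n_bits - 1) - 1) + 1e-12
    return torch.round(t / scale) * scale

x = torch.randn(1, 16, 32, 32)                   # a batch of feature maps
q_bands = [quantize(b) for b in haar2d(x)]       # quantize wavelet coefficients
print([b.shape for b in q_bands])
```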
Revisiting Random Channel Pruning for Neural Network Compression
The proposed approach provides a new way to compare different methods, namely how well they perform relative to random pruning.
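A minimal sketch of that random baseline: pick output channels uniformly at random at a target ratio and zero them (a real pipeline would physically remove them and fine-tune):

```python
import torch

def random_channel_mask(n_channels: int, prune_ratio: float) -> torch.Tensor:
    """Binary keep-mask with a random subset of channels retained."""
    keep = int(round(n_channels * (1 - prune_ratio)))
    idx = torch.randperm(n_channels)[:keep]
    mask = torch.zeros(n_channels)
    mask[idx] = 1.0
    return mask

conv = torch.nn.Conv2d(64, 128, 3, padding=1)
mask = random_channel_mask(conv.out_channels, prune_ratio=0.5)
with torch.no_grad():
    conv.weight.mul_(mask.view(-1, 1, 1, 1))     # zero randomly pruned output channels
    conv.bias.mul_(mask)
print(int(mask.sum()), "of", conv.out_channels, "channels kept")
```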
Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction
Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operations induce additional memory costs which, as we show, can be significantly reduced by quantizing the gradients.
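A sketch of the memory trick for the simplest case: a custom autograd ReLU that saves only a binary mask for the backward pass instead of the full-precision input (the paper generalizes this to few-bit quantized derivatives of other activations):

```python
import torch

class OneBitReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        mask = x > 0
        ctx.save_for_backward(mask)              # bool tensor (1 byte/elem in torch) instead of float32 input
        return x * mask

    @staticmethod
    def backward(ctx, grad_out):
        (mask,) = ctx.saved_tensors
        return grad_out * mask                   # ReLU derivative recovered exactly from the mask

x = torch.randn(4, 256, requires_grad=True)
OneBitReLU.apply(x).sum().backward()
print(torch.allclose(x.grad, (x > 0).float()))   # True: same gradient, smaller footprint
```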
Neural Network Compression of ACAS Xu Early Prototype is Unsafe: Closed-Loop Verification through Quantized State Backreachability
Analysis of this system has spurred a significant body of research in the formal methods community on neural network verification.
NeRV: Neural Representations for Videos
In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).
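A toy sketch of the representation (heavily simplified; the paper uses positional encodings and a convolutional decoder): fit a network mapping frame index to frame, so that compressing the network's weights compresses the video:

```python
import torch

class TinyNeRV(torch.nn.Module):
    def __init__(self, h: int = 16, w: int = 16, hidden: int = 256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.GELU(),
            torch.nn.Linear(hidden, h * w))
        self.h, self.w = h, w

    def forward(self, t):                        # t: (B, 1) normalized frame index
        return self.net(t).view(-1, self.h, self.w)

video = torch.rand(8, 16, 16)                    # 8 tiny grayscale "frames"
t = torch.linspace(0, 1, 8).unsqueeze(1)
model = TinyNeRV()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(500):                             # overfit the network to the video
    loss = (model(t) - video).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
print(f"reconstruction MSE: {loss.item():.4f}")  # now prune/quantize `model` to compress the video
```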
CHIP: CHannel Independence-based Pruning for Compact Neural Networks
Filter pruning has been widely used for neural network compression because it enables practical acceleration.
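A hedged sketch of one way to score channel independence, using my simplification of a nuclear-norm criterion rather than the paper's exact metric: channels whose removal barely changes the feature matrix's nuclear norm carry little independent information and are prune candidates.

```python
import torch

def independence_scores(feat: torch.Tensor) -> torch.Tensor:
    """feat: (C, N) matrix of C channels flattened over batch and spatial dims."""
    full = torch.linalg.matrix_norm(feat, ord="nuc")
    scores = torch.empty(feat.shape[0])
    for c in range(feat.shape[0]):
        reduced = feat.clone()
        reduced[c] = 0.0                         # remove channel c
        scores[c] = full - torch.linalg.matrix_norm(reduced, ord="nuc")
    return scores                                # low score => redundant channel

feat = torch.randn(32, 4 * 8 * 8)                # 32 channels, flattened feature maps
scores = independence_scores(feat)
print("prune first:", scores.argsort()[:8].tolist())
```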
Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation
Despite these advances in techniques for distilling knowledge, the aggregation of different distillation paths has not been studied comprehensively.
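An illustrative sketch of aggregating several distillation paths with adaptive weights; the softmax weighting over per-path MSE losses is my assumption, not necessarily the paper's rule:

```python
import torch

def aggregated_kd_loss(student_feats, teacher_feats, logit_w):
    """Weighted sum of per-path distillation losses; weights are learned with the student."""
    weights = torch.softmax(logit_w, dim=0)      # adaptive path weights
    losses = torch.stack([
        (s - t.detach()).pow(2).mean()           # per-path feature-matching loss
        for s, t in zip(student_feats, teacher_feats)])
    return (weights * losses).sum()

paths = 3
logit_w = torch.zeros(paths, requires_grad=True)                   # one logit per path
s_feats = [torch.randn(4, 64, requires_grad=True) for _ in range(paths)]
t_feats = [torch.randn(4, 64) for _ in range(paths)]
loss = aggregated_kd_loss(s_feats, t_feats, logit_w)
loss.backward()                                  # updates both student and path weights
print(loss.item())
```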