Neural Network Compression

14 papers with code · Methodology
Subtask of Model Compression

Greatest papers with code

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

28 Jan 2019 · NervanaSystems/distiller

The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.

LANGUAGE MODELLING NEURAL NETWORK COMPRESSION QUANTIZATION
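
For intuition, here is a minimal NumPy sketch of the channel-splitting idea (the function name and the 50/50 split heuristic are illustrative; the authors' Distiller code is the reference implementation): duplicating an outlier input channel and halving both copies leaves the layer's output unchanged while shrinking the weight range the quantizer must cover.

import numpy as np

def split_outlier_channels(W, x, num_splits=1):
    # W: (out_features, in_features) weight matrix
    # x: (in_features,) activation vector feeding the layer
    W = W.copy()
    x = x.copy()
    for _ in range(num_splits):
        # channel whose largest absolute weight is the global outlier
        k = np.argmax(np.abs(W).max(axis=0))
        half = W[:, k] / 2.0
        W[:, k] = half
        W = np.concatenate([W, half[:, None]], axis=1)  # append the duplicate
        x = np.concatenate([x, x[k:k + 1]])             # duplicate its input too
    return W, x

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)); W[2, 5] = 10.0            # plant an outlier
x = rng.normal(size=8)
W2, x2 = split_outlier_channels(W, x)
assert np.allclose(W @ x, W2 @ x2)                     # output preserved
print(np.abs(W).max(), "->", np.abs(W2).max())         # 10.0 -> 5.0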

Learning Sparse Networks Using Targeted Dropout

31 May 2019 · for-ai/TD

Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights.

NETWORK PRUNING NEURAL NETWORK COMPRESSION
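
A minimal NumPy sketch of weight-level targeted dropout under a magnitude criterion; gamma (targeting proportion) and alpha (drop rate) follow the paper's notation, while the flat-array treatment of the weight tensor is a simplification:

import numpy as np

def targeted_dropout(W, gamma=0.5, alpha=0.66, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    flat = np.abs(W).ravel()
    k = int(gamma * flat.size)                  # size of the targeting set
    threshold = np.partition(flat, k - 1)[k - 1]
    targeted = np.abs(W) <= threshold           # lowest-magnitude weights
    drop = targeted & (rng.random(W.shape) < alpha)
    return np.where(drop, 0.0, W)               # masked weights used this step

Because the lowest-magnitude weights are repeatedly dropped, the network learns not to depend on them (the self-reinforcing part of the criterion), which is what makes post-training pruning nearly free.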

Soft Weight-Sharing for Neural Network Compression

13 Feb 2017 · KarenUllrich/Tutorial_BayesianCompressionForDL

The success of deep learning in numerous application domains has created the desire to run and train deep networks on mobile devices.

NEURAL NETWORK COMPRESSION QUANTIZATION
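
The method itself retrains the network under a learnable Gaussian-mixture prior on the weights; weights cluster around the mixture means and can then be replaced by them. A minimal NumPy sketch of that penalty term, with parameter shapes assumed for illustration:

import numpy as np

def gmm_weight_penalty(w, means, log_stds, log_pis):
    # w: flat weight vector (N,); means/log_stds/log_pis: per-component (K,)
    stds = np.exp(log_stds)
    pis = np.exp(log_pis - np.logaddexp.reduce(log_pis))     # softmax priors
    # component log-densities, shape (N, K)
    log_norm = (-0.5 * ((w[:, None] - means) / stds) ** 2
                - np.log(stds) - 0.5 * np.log(2 * np.pi))
    log_mix = np.logaddexp.reduce(np.log(pis) + log_norm, axis=1)
    return -log_mix.sum()    # add to the task loss; minimizing pulls
                             # weights toward the (learnable) mixture means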

A Closer Look at Structured Pruning for Neural Network Compression

10 Oct 2018 · BayesWatch/pytorch-prunes

Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network.

NETWORK PRUNING NEURAL NETWORK COMPRESSION
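
A minimal PyTorch sketch of one prune step under the common L1-norm criterion (one of several criteria the paper compares); slicing the next layer's input channels and the fine-tuning loop are omitted:

import torch
import torch.nn as nn

def prune_conv_channels(conv, keep_ratio=0.75):
    # Score each filter by its L1 norm (sum of absolute weights).
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep = torch.topk(norms, n_keep).indices.sort().values
    # Build a narrower layer and copy over the surviving filters.
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned

conv = nn.Conv2d(16, 32, 3, padding=1)
print(prune_conv_channels(conv))   # Conv2d(16, 24, kernel_size=(3, 3), ...)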

COP: Customized Deep Model Compression via Regularized Correlation-Based Filter-Level Pruning

25 Jun 2019 · ZJULearning/COP

Cross-layer filter comparison is unachievable under norm-based criteria, since filter importance is defined only locally within each layer.

NEURAL NETWORK COMPRESSION
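
Addressing that drawback means making scores comparable across layers before ranking them in one global pool. A sketch of such a global-ranking step, with per-layer max-normalization as an illustrative stand-in for COP's correlation-based, cost-regularized criterion:

import numpy as np

def global_ranking(per_layer_scores, prune_frac=0.3):
    # per_layer_scores: {layer_name: array of per-filter importance scores}
    pool = []
    for layer, scores in per_layer_scores.items():
        normed = scores / (np.max(scores) + 1e-12)    # per-layer normalization
        pool += [(s, layer, i) for i, s in enumerate(normed)]
    pool.sort()                                       # least important first
    n_prune = int(prune_frac * len(pool))
    return pool[:n_prune]                             # (score, layer, index)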

Deep Neural Network Compression with Single and Multiple Level Quantization

6 Mar 2018 · yuhuixu1993/SLQ

In this paper, we propose two novel network quantization approaches, single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit quantization (ternary). We are the first to consider the network quantization from both width and depth level.

NEURAL NETWORK COMPRESSION QUANTIZATION
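
For the extremely low-bit regime MLQ targets, here is a minimal NumPy sketch of ternary weight quantization; the threshold rule follows the common ternary-weight-network heuristic rather than the paper's incremental, layer-by-layer procedure:

import numpy as np

def ternary_quantize(W, delta_factor=0.7):
    # Map each weight to {-alpha, 0, +alpha} using a magnitude threshold.
    delta = delta_factor * np.mean(np.abs(W))             # sparsity threshold
    mask = np.abs(W) > delta
    alpha = np.abs(W[mask]).mean() if mask.any() else 0.0  # per-tensor scale
    return alpha * np.sign(W) * mask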

MUSCO: Multi-Stage Compression of neural networks

24 Mar 2019 · juliagusak/musco

Low-rank tensor approximation is a promising approach to compressing deep neural networks.

NEURAL NETWORK COMPRESSION
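
A minimal sketch of the underlying idea for a dense layer: truncated SVD replaces a weight matrix with two thin factors. MUSCO itself uses tensor decompositions for convolutions and alternates gentle rank reduction with fine-tuning; only a single factorization step is shown here.

import numpy as np

def low_rank_factorize(W, rank):
    # Replace W (out x in) with A (out x rank) @ B (rank x in).
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]          # absorb singular values into A
    B = Vt[:rank]
    return A, B

W = np.random.default_rng(0).normal(size=(256, 512))
A, B = low_rank_factorize(W, rank=32)
# parameters: 256*512 = 131072  ->  (256 + 512)*32 = 24576
print(np.linalg.norm(W - A @ B) / np.linalg.norm(W))   # relative error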

Efficient Neural Network Compression

CVPR 2019 · Hyeji-Kim/ENC

The favorable accuracy-complexity trade-off of our method, together with its extremely fast speed, makes it well suited to neural network compression.

ACCURACY METRICS NEURAL NETWORK COMPRESSION

Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters

ICLR 2019 · cambridge-mlg/miracle

While deep neural networks are a highly successful model class, their large memory footprint puts considerable strain on energy consumption, communication bandwidth, and storage requirements.

NEURAL NETWORK COMPRESSION QUANTIZATION