Neural Network Compression

53 papers with code • 2 benchmarks • 2 datasets

Greatest papers with code

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

NervanaSystems/distiller 28 Jan 2019

The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.

Language Modelling Neural Network Compression +1
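
The channel-splitting identity behind this paper is easy to verify on a toy linear layer. The NumPy sketch below is purely illustrative (it is not the distiller implementation, and all sizes and names are made up): it duplicates the input channel holding the largest weight, halves both copies, duplicates the matching input entry, and checks that the output is unchanged while the weight range the quantizer must cover shrinks.

```python
# Illustrative sketch of outlier channel splitting on a toy linear layer:
# splitting the channel that contains the outlier weight preserves the output
# exactly while halving the largest magnitude the quantizer must represent.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))          # linear layer: y = W @ x
W[2, 5] = 10.0                       # plant an outlier weight
x = rng.normal(size=8)

y_ref = W @ x

# Split the input channel containing the outlier: duplicate the column,
# halve both copies, and duplicate the corresponding input activation.
c = int(np.argmax(np.abs(W).max(axis=0)))        # channel holding the outlier
W_split = np.concatenate([W, W[:, [c]]], axis=1)
W_split[:, c] *= 0.5
W_split[:, -1] *= 0.5
x_split = np.concatenate([x, x[[c]]])

y_split = W_split @ x_split

print(np.abs(W).max(), np.abs(W_split).max())    # outlier magnitude halved: 10.0 -> 5.0
assert np.allclose(y_ref, y_split)               # layer output is preserved
```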

Forward and Backward Information Retention for Accurate Binary Neural Networks

JDAI-CV/dabnn CVPR 2020

Our empirical study indicates that quantization causes information loss in both forward and backward propagation, which is the bottleneck for training accurate binary neural networks.

Binarization Neural Network Compression +1
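
For context, the forward and backward approximations the paper analyses come from the standard sign-plus-straight-through-estimator binarization. The PyTorch sketch below shows only that baseline scheme; it is not the paper's IR-Net method.

```python
# Baseline binarization with a straight-through estimator (STE): sign() in the
# forward pass, clipped identity gradient in the backward pass. These two
# approximations are where the forward/backward information loss comes from.
import torch


class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # forward: values collapse to {-1, +1}

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # backward: pretend sign() was the identity, but zero the gradient
        # outside [-1, 1] (the usual clipped STE)
        return grad_output * (x.abs() <= 1).float()


w = torch.randn(16, requires_grad=True)
loss = (BinarizeSTE.apply(w) * torch.randn(16)).sum()
loss.backward()
print(w.grad)                                     # gradients flow only where |w| <= 1
```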

torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation

yoshitomo-matsubara/torchdistill 25 Nov 2020

While knowledge distillation (transfer) has been attracting attention from the research community, recent developments in the field have heightened the need for reproducible studies and highly generalized frameworks to lower the barriers to high-quality, reproducible deep learning research.

Instance Segmentation Knowledge Distillation +1

Head Network Distillation: Splitting Distilled Deep Neural Networks for Resource-Constrained Edge Computing Systems

yoshitomo-matsubara/torchdistill 20 Nov 2020

In this paper, we propose to modify the structure and training process of DNN models for complex image classification tasks to achieve in-network compression in the early network layers.

Edge-computing Knowledge Distillation +1
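
A rough sketch of the idea, assuming a bottlenecked replacement head trained to reproduce the original head's intermediate features (PyTorch, with made-up layer sizes; not the torchdistill implementation):

```python
# Sketch of head network distillation: a compact student "head" with a narrow
# bottleneck is trained to mimic the intermediate features produced by the
# original (teacher) head, so only a small tensor needs to cross the split
# point between device and edge server. Layer sizes here are placeholders.
import torch
import torch.nn as nn

teacher_head = nn.Sequential(               # stands in for the early layers of a big model
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
).eval()

student_head = nn.Sequential(               # bottlenecked replacement head
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 8, 3, padding=1), nn.ReLU(),     # narrow bottleneck to transmit
    nn.Conv2d(8, 128, 3, stride=2, padding=1),     # decode back to the teacher's shape
)

opt = torch.optim.Adam(student_head.parameters(), lr=1e-3)
x = torch.randn(4, 3, 64, 64)               # a toy batch of images
with torch.no_grad():
    target = teacher_head(x)                # teacher's intermediate representation
loss = nn.functional.mse_loss(student_head(x), target)
loss.backward()
opt.step()
```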

Distilled Split Deep Neural Networks for Edge-Assisted Real-Time Systems

yoshitomo-matsubara/torchdistill 1 Oct 2019

Offloading the execution of complex Deep Neural Network (DNN) models to compute-capable devices at the network edge, that is, edge servers, can significantly reduce capture-to-output delay.

Edge-computing Knowledge Distillation +1

Similarity-Preserving Knowledge Distillation

yoshitomo-matsubara/torchdistill ICCV 2019

Knowledge distillation is a widely applicable technique for training a student neural network under the guidance of a trained teacher network.

Knowledge Distillation Neural Network Compression
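
As a reference point, the classic soft-target distillation loss of Hinton et al., on which similarity-preserving distillation builds, fits in a few lines. The PyTorch sketch below is illustrative only; the temperature and weighting are arbitrary, and it is not the paper's similarity-preserving loss.

```python
# Classic soft-target knowledge distillation: the student matches the teacher's
# temperature-softened output distribution in addition to the usual
# cross-entropy on the ground-truth labels.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and softened KL to the teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                      # T^2 keeps gradient magnitudes comparable
    return alpha * hard + (1.0 - alpha) * soft


# toy usage
s = torch.randn(8, 10, requires_grad=True)    # student logits
t = torch.randn(8, 10)                        # frozen teacher logits
y = torch.randint(0, 10, (8,))
distillation_loss(s, t, y).backward()
```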

Data-Free Learning of Student Networks

huawei-noah/Data-Efficient-Model-Compression ICCV 2019

Learning portable neural networks is essential for computer vision, so that heavy pre-trained deep models can be deployed on edge devices such as mobile phones and micro sensors.

Neural Network Compression

Neural Network Compression Framework for fast model inference

openvinotoolkit/nncf_pytorch 20 Feb 2020

In this work we present a new framework for neural network compression with fine-tuning, which we call the Neural Network Compression Framework (NNCF).

Binarization Fine-tuning +2

Learning Sparse Networks Using Targeted Dropout

for-ai/TD 31 May 2019

Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights.

Network Pruning Neural Network Compression
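
A minimal sketch of that selection step, assuming the weight-level variant applied over a whole weight matrix for brevity (the paper also describes unit-level and per-column variants): the gamma fraction of lowest-magnitude weights become prune candidates, and each candidate is dropped with probability alpha before the forward pass. This is illustrative PyTorch, not the for-ai/TD implementation, and gamma/alpha are placeholder values.

```python
# Targeted weight dropout sketch: low-magnitude weights are the drop candidates,
# each dropped independently with probability alpha, so gradients are computed
# only for the surviving weights.
import torch


def targeted_weight_dropout(w: torch.Tensor, gamma: float = 0.5, alpha: float = 0.5,
                            training: bool = True) -> torch.Tensor:
    if not training:
        return w
    k = int(gamma * w.numel())                        # number of low-magnitude candidates
    if k == 0:
        return w
    threshold = w.abs().flatten().kthvalue(k).values  # magnitude cutoff for candidates
    candidates = w.abs() <= threshold
    drop = candidates & (torch.rand_like(w) < alpha)  # drop each candidate w.p. alpha
    return w * (~drop).float()


# toy usage inside a linear layer's forward pass
w = torch.randn(32, 64, requires_grad=True)
x = torch.randn(8, 64)
y = x @ targeted_weight_dropout(w).t()                # gradients flow only to kept weights
y.sum().backward()
```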

ZeroQ: A Novel Zero Shot Quantization Framework

amirgholami/ZeroQ CVPR 2020

Importantly, ZeroQ has a very low computational overhead: it can finish the entire quantization process in less than 30 s (0.5% of the training time of one epoch of ResNet50 on ImageNet).

Neural Network Compression Quantization