The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.
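To make the post-training setting concrete, below is a minimal NumPy sketch of symmetric uniform quantization applied directly to a pretrained weight tensor, with no retraining involved; the function names, bit-width, and scaling scheme are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def quantize_tensor(w, num_bits=8):
    """Uniformly quantize a float tensor to num_bits integers (symmetric scheme).

    Illustrative post-training quantization sketch; scale handling is simplified.
    """
    qmax = 2 ** (num_bits - 1) - 1                      # e.g. 127 for 8-bit symmetric
    scale = np.max(np.abs(w)) / qmax + 1e-12            # per-tensor scale from the weight range
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize_tensor(q, scale):
    return q.astype(np.float32) * scale

# Usage: quantize a pretrained layer's weights without any (re)training.
weights = np.random.randn(64, 128).astype(np.float32)
q, scale = quantize_tensor(weights)
recon = dequantize_tensor(q, scale)
print("max abs error:", np.abs(weights - recon).max())
```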
Our empirical study indicates that quantization causes information loss in both forward and backward propagation, which is the bottleneck for training accurate binary neural networks.
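As a rough illustration of where that information loss enters, the following PyTorch sketch binarizes weights with a sign function in the forward pass and a clipped straight-through estimator in the backward pass; the class name and clipping rule are assumptions for illustration, not the paper's exact formulation.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass, straight-through gradient in the backward pass."""

    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)            # forward: real-valued weights collapse to {-1, +1}

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # backward: gradient is passed through only where |w| <= 1 (clipped STE),
        # so gradient information outside that range is discarded as well
        return grad_output * (w.abs() <= 1).float()

w = torch.randn(8, 8, requires_grad=True)
y = BinarizeSTE.apply(w).sum()
y.backward()
print(w.grad)
```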
Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights.
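A hedged sketch of this idea in PyTorch is given below: the lowest-magnitude fraction of weights is treated as the drop candidates (the self-reinforcing criterion), and each candidate is dropped stochastically before gradients are taken; the function name and the specific rates are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def targeted_weight_dropout(w, targ_rate=0.5, drop_prob=0.5):
    """Illustrative targeted dropout for a weight matrix.

    Candidates: the `targ_rate` fraction of weights with the smallest magnitude;
    each candidate is dropped independently with probability `drop_prob`
    before the gradients for the remaining weights are computed.
    """
    k = int(targ_rate * w.numel())
    if k == 0:
        return w
    threshold = w.abs().flatten().kthvalue(k).values            # magnitude cut-off for candidates
    candidates = w.abs() <= threshold
    drop_mask = candidates & (torch.rand_like(w) < drop_prob)
    return w * (~drop_mask).float()

w = torch.randn(16, 16, requires_grad=True)
w_dropped = targeted_weight_dropout(w)
loss = (w_dropped ** 2).sum()
loss.backward()      # gradients flow only through the weights kept this step
```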
In this work we present a new framework for neural network compression with fine-tuning, which we call the Neural Network Compression Framework (NNCF).
While knowledge distillation (transfer) has been attracting attention from the research community, recent developments in the field have heightened the need for reproducible studies and highly generalized frameworks to lower the barriers to such high-quality, reproducible deep learning research.
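For readers unfamiliar with the technique, the sketch below shows the standard soft-target distillation loss (a temperature-softened KL term plus ordinary cross-entropy); it is a generic textbook formulation rather than the API of any particular framework, and the temperature and weighting values are assumed defaults.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Soft-target knowledge distillation loss; T and alpha are illustrative defaults."""
    # KL divergence between temperature-softened teacher and student distributions
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary cross-entropy against the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(32, 10, requires_grad=True)
teacher_logits = torch.randn(32, 10)
labels = torch.randint(0, 10, (32,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```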
The success of deep learning in numerous application domains has created the desire to run and train deep neural networks on mobile devices.
Structured pruning is a popular method for compressing a neural network: given a large trained network, one alternates between removing channel connections and fine-tuning, reducing the overall width of the network.
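One pruning step of that kind might look like the PyTorch sketch below, which ranks a convolution's output channels by filter L1 norm, keeps the top fraction, and slices the matching input channels of the following convolution; the helper name, the L1 criterion, and the keep ratio are assumptions for illustration, and fine-tuning would follow in the actual alternating procedure.

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv, next_conv, keep_ratio=0.75):
    """Illustrative structured pruning step: drop low-L1-norm output channels of `conv`
    and the corresponding input channels of `next_conv`, shrinking the network width."""
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))          # L1 norm per output channel
    keep = torch.argsort(scores, descending=True)[:n_keep].sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding, bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()

    pruned_next = nn.Conv2d(n_keep, next_conv.out_channels, next_conv.kernel_size,
                            stride=next_conv.stride, padding=next_conv.padding,
                            bias=next_conv.bias is not None)
    pruned_next.weight.data = next_conv.weight.data[:, keep].clone()
    if next_conv.bias is not None:
        pruned_next.bias.data = next_conv.bias.data.clone()
    return pruned, pruned_next

conv1, conv2 = nn.Conv2d(3, 16, 3, padding=1), nn.Conv2d(16, 32, 3, padding=1)
p1, p2 = prune_conv_channels(conv1, conv2, keep_ratio=0.5)
print(p1.out_channels, p2.in_channels)   # both reduced to 8
```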