Model Compression

Neural Network Compression Framework

Introduced by Kozlov et al. in Neural Network Compression Framework for fast model inference

Neural Network Compression Framework, or NNCF, is a Python-based framework for neural network compression with fine-tuning. It leverages recent advances of various network compression methods and implements some of them, namely quantization, sparsity, filter pruning and binarization. These methods allow producing more hardware-friendly models that can be efficiently run on general-purpose hardware computation units (CPU, GPU) or specialized deep learning accelerators.

Source: Neural Network Compression Framework for fast model inference

Papers


Paper Code Results Date Stars

Tasks


Task Papers Share
Collaborative Filtering 1 20.00%
Link Prediction 1 20.00%
Retrieval 1 20.00%
Neural Network Compression 1 20.00%
Quantization 1 20.00%

Components


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign

Categories