Quantization
1032 papers with code • 10 benchmarks • 18 datasets
Quantization is a promising technique for reducing the computational cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).
Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
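To make the float-to-fixed-point replacement concrete, here is a minimal sketch of affine (asymmetric) int8 quantization of a float32 array. The helper names (`quantize_int8`, `dequantize`) are ours for illustration, not from any particular library:

```python
import numpy as np

def quantize_int8(x):
    """Map a float32 array to int8 plus a (scale, zero_point) pair.

    Illustrative sketch: real frameworks calibrate ranges more carefully
    (e.g., per-channel, with percentile clipping).
    """
    qmin, qmax = -128, 127
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid 0 for constant inputs
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover an approximate float32 array from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

q, scale, zp = quantize_int8(np.array([-1.0, 0.0, 1.0], dtype=np.float32))
x_hat = dequantize(q, scale, zp)
```

Each element is stored in one byte instead of four, at the cost of a reconstruction error of at most about one quantization step.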
Libraries
Use these libraries to find Quantization models and implementations.
Most implemented papers
Polysemous codes
This paper considers the problem of approximate nearest neighbor search in the compressed domain.
Learned Step Size Quantization
Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases.
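A sketch of the forward pass of LSQ-style fake quantization, assuming the quantize-then-dequantize formulation from the paper (clip, round by step size s, rescale). Only the forward computation is shown; in training, s is a learned parameter updated via a straight-through estimator for round():

```python
import numpy as np

def lsq_fake_quant(v, s, n_bits=8, signed=True):
    """Quantize v with learnable step size s, then dequantize (sketch).

    signed: use a symmetric signed range [-2^(b-1), 2^(b-1)-1],
    otherwise an unsigned range [0, 2^b - 1].
    """
    if signed:
        q_n, q_p = 2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1
        v_bar = np.round(np.clip(v / s, -q_n, q_p))  # integer code
    else:
        q_p = 2 ** n_bits - 1
        v_bar = np.round(np.clip(v / s, 0, q_p))
    return v_bar * s  # dequantized value used by the next layer

v_hat = lsq_fake_quant(np.array([0.26, -0.31]), s=0.1)
```

The paper additionally scales the gradient flowing to s to keep its updates balanced against the weight updates; that detail is omitted here.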
Improvements to Target-Based 3D LiDAR to Camera Calibration
The homogeneous transformation between a LiDAR and monocular camera is required for sensor fusion tasks, such as SLAM.
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios.
Link and code: Fast indexing with graphs and compact regression codes
Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.
Single Path One-Shot Neural Architecture Search with Uniform Sampling
It is easy to train and fast to search.
Unsupervised Cross-lingual Representation Learning for Speech Recognition
This paper presents XLSR which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.
QVRF: A Quantization-error-aware Variable Rate Framework for Learned Image Compression
In this paper, we present a Quantization-error-aware Variable Rate Framework (QVRF) that utilizes a univariate quantization regulator a to achieve wide-range variable rates within a single model.
Model compression via distillation and quantization
Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning.
Data-Free Quantization Through Weight Equalization and Bias Correction
This improves quantized-model accuracy, and can be applied to many common computer vision architectures with a straightforward API call.
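The core idea of cross-layer weight equalization can be sketched for two consecutive linear layers separated by a ReLU: because ReLU commutes with positive per-channel scaling, each output channel of layer 1 can be rescaled (and the scale absorbed into layer 2) so the two layers' per-channel weight ranges match, which makes per-tensor quantization less lossy. The function below is our illustration of that idea, not the library's API:

```python
import numpy as np

def equalize(w1, b1, w2):
    """Equalize per-channel weight ranges across two layers (sketch).

    w1: (out, in) weights of layer 1, b1: its bias,
    w2: (out, in) weights of layer 2; a ReLU is assumed between them.
    """
    r1 = np.abs(w1).max(axis=1)   # range of each output channel of layer 1
    r2 = np.abs(w2).max(axis=0)   # range of each input channel of layer 2
    s = np.sqrt(r1 / r2)          # s_i = (1/r2_i) * sqrt(r1_i * r2_i)
    w1_eq = w1 / s[:, None]       # scale channel i of layer 1 down by s_i
    b1_eq = b1 / s                # bias scales with its channel
    w2_eq = w2 * s[None, :]       # compensate in layer 2 input channel i
    return w1_eq, b1_eq, w2_eq, s

rng = np.random.default_rng(0)
w1 = rng.standard_normal((3, 4))
b1 = rng.standard_normal(3)
w2 = rng.standard_normal((2, 3))
w1_eq, b1_eq, w2_eq, s = equalize(w1, b1, w2)
```

After equalization the network's function is unchanged (w2_eq · relu(w1_eq·x + b1_eq) equals the original), but both layers now share the same per-channel range sqrt(r1_i * r2_i), so a single quantization grid fits them better.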