Data Free Quantization
13 papers with code • 2 benchmarks • 1 dataset
Data Free Quantization is a technique to achieve a highly accurate quantized model without accessing any training data.
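As a minimal sketch of the basic idea, the weights of a trained layer can be quantized using only the weights themselves, with no training data involved. The function names `quantize_weights` and `dequantize` and the sample weights are hypothetical, chosen for illustration; symmetric per-tensor int8 quantization is assumed here and is only one of several possible schemes.

```python
def quantize_weights(weights, num_bits=8):
    """Symmetric uniform quantization that needs only the weights (no data)."""
    qmax = 2 ** (num_bits - 1) - 1                  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # real value of one integer step
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate real-valued weights."""
    return [v * scale for v in q]


# Hypothetical weights of a single layer.
weights = [0.5, -1.27, 0.0, 0.31]
q, scale = quantize_weights(weights)
restored = dequantize(q, scale)
```

With this scheme the rounding error per weight is at most half a quantization step (`scale / 2`). Activation ranges are the harder part, since they normally depend on input data.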
This improves the accuracy of quantized models and can be applied to many common computer vision architectures with a straightforward API call.
We find that this is often insufficient to capture the distribution of the original data, especially around the decision boundaries.
We first show theoretically that the diversity of synthetic samples is crucial for data-free quantization, whereas in existing approaches the synthetic data, being completely constrained by BN statistics, is found experimentally to exhibit severe homogenization at both the distribution and sample levels.
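To illustrate why matching BN statistics alone does not force diversity, here is a rough sketch of a BN statistics matching loss. The function `bn_statistics_loss`, the squared-error form of the loss, and the toy batches are assumptions made for illustration, not the formulation of any particular paper.

```python
def bn_statistics_loss(features, running_mean, running_var):
    """Squared distance between batch statistics and stored BN statistics.

    features: list of samples, each a list with one value per channel.
    running_mean, running_var: per-channel statistics stored in a BN layer.
    """
    n = len(features)
    loss = 0.0
    for c in range(len(running_mean)):
        values = [sample[c] for sample in features]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n  # biased variance
        loss += (mean - running_mean[c]) ** 2 + (var - running_var[c]) ** 2
    return loss


# Hypothetical single-channel BN layer with mean 2 and variance 1.
running_mean, running_var = [2.0], [1.0]

batch_a = [[1.0], [3.0]]                  # two distinct samples
batch_b = [[1.0], [3.0], [1.0], [3.0]]    # the same two samples, repeated
batch_c = [[0.0], [0.0]]                  # statistics do not match

loss_a = bn_statistics_loss(batch_a, running_mean, running_var)
loss_b = bn_statistics_loss(batch_b, running_mean, running_var)
loss_c = bn_statistics_loss(batch_c, running_mean, running_var)
```

Both `batch_a` and `batch_b` reach a loss of zero, even though `batch_b` only repeats the same two samples. A loss of this kind therefore does not, by itself, penalize duplicated samples.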
This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.
The above insights guide us to design a relative value metric to optimize the Gaussian noise to approximate the real images, which are then utilized to calibrate the quantization parameters.
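The calibration step mentioned above can be sketched as follows. The sketch feeds plain Gaussian noise through a toy layer and derives activation quantization parameters from the observed range. The optimization of the noise toward realistic images is deliberately omitted, and `linear_relu`, `calibrate_activation_range`, the toy weights, and the min/max asymmetric scheme are all assumptions made for illustration.

```python
import random


def linear_relu(x, weights, bias):
    """Toy layer: one output per weight row, followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def calibrate_activation_range(samples, weights, bias, num_bits=8):
    """Derive an asymmetric scale and zero point from observed activations."""
    lo, hi = 0.0, 0.0  # the range always includes zero
    for x in samples:
        out = linear_relu(x, weights, bias)
        lo = min(lo, min(out))
        hi = max(hi, max(out))
    qmax = 2 ** num_bits - 1                     # 255 for unsigned 8 bits
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)              # integer level that represents 0.0
    return scale, zero_point


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Unoptimized Gaussian noise standing in for calibration inputs.
samples = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(64)]
weights = [[0.2, -0.1, 0.4, 0.3], [-0.5, 0.1, 0.0, 0.2]]  # hypothetical layer
bias = [0.1, -0.05]

scale, zero_point = calibrate_activation_range(samples, weights, bias)
```

Because the toy layer ends in a ReLU, the observed minimum is zero and the zero point comes out as 0. In a real pipeline the synthetic inputs would first be optimized before this step.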
To deal with the performance drop induced by quantization errors, a popular method is to use training data to fine-tune quantized networks.
In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT.