HPTQ: Hardware-Friendly Post Training Quantization

Neural network quantization enables the deployment of models on edge devices. An essential requirement for hardware efficiency is that the quantizers be hardware-friendly: uniform, symmetric, and with power-of-two thresholds. To the best of our knowledge, current post-training quantization methods do not support all of these constraints simultaneously. In this work, we introduce a hardware-friendly post-training quantization (HPTQ) framework, which addresses this problem by synergistically combining several known quantization methods. We perform a large-scale study on four tasks: classification, object detection, semantic segmentation, and pose estimation, over a wide variety of network architectures. Our extensive experiments show that competitive results can be obtained under these hardware-friendly constraints.
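To make these constraints concrete, here is a minimal NumPy sketch of a uniform, symmetric quantizer whose clipping threshold is restricted to a power of two, so rescaling maps to bit shifts in hardware. This is an illustration of the constraints only, not the paper's implementation; the function names and the simple max-based threshold choice are ours.

```python
import numpy as np

def pow2_threshold(x):
    """Smallest power-of-two threshold covering the tensor's max magnitude."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return 1.0  # degenerate all-zero tensor; any threshold works
    return 2.0 ** np.ceil(np.log2(max_abs))

def quantize_symmetric(x, threshold, n_bits=8):
    """Uniform symmetric quantization with a power-of-two threshold.

    Returns the dequantized ("fake-quantized") tensor, as used to
    simulate integer inference in floating point.
    """
    levels = 2 ** (n_bits - 1)      # e.g. 128 for 8 bits
    scale = threshold / levels      # power-of-two scale => shift-friendly
    q = np.clip(np.round(x / scale), -levels, levels - 1)
    return q * scale

# Example: 8-bit quantization of a random conv kernel.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
t = pow2_threshold(w)              # threshold rounded up to a power of two
w_q = quantize_symmetric(w, t)     # values on a uniform symmetric 8-bit grid
```

The W8A8 labels in the results below indicate 8-bit weights and 8-bit activations (n_bits=8 for both in the sketch's terms).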


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Quantization | ImageNet | DenseNet-121 W8A8 | Top-1 Accuracy (%) | 73.356 | #19 |
| Quantization | ImageNet | DenseNet-121 W8A8 | Weight bits | 8 | #10 |
| Quantization | ImageNet | DenseNet-121 W8A8 | Activation bits | 8 | #9 |
| Quantization | ImageNet | EfficientNet-B0 W8A8 | Top-1 Accuracy (%) | 74.216 | #16 |
| Quantization | ImageNet | EfficientNet-B0 W8A8 | Weight bits | 8 | #10 |
| Quantization | ImageNet | EfficientNet-B0 W8A8 | Activation bits | 8 | #9 |
| Quantization | ImageNet | EfficientNet-B0 ReLU W8A8 | Top-1 Accuracy (%) | 77.092 | #11 |
| Quantization | ImageNet | EfficientNet-B0 ReLU W8A8 | Weight bits | 8 | #10 |
| Quantization | ImageNet | EfficientNet-B0 ReLU W8A8 | Activation bits | 8 | #9 |
| Quantization | ImageNet | Xception W8A8 | Top-1 Accuracy (%) | 78.972 | #8 |
| Quantization | ImageNet | Xception W8A8 | Weight bits | 8 | #10 |
| Quantization | ImageNet | Xception W8A8 | Activation bits | 8 | #9 |
| Quantization | ImageNet | MobileNetV2 W8A8 | Top-1 Accuracy (%) | 71.46 | #23 |
| Quantization | ImageNet | MobileNetV2 W8A8 | Weight bits | 8 | #10 |
| Quantization | ImageNet | MobileNetV2 W8A8 | Activation bits | 8 | #9 |
| Quantization | MS COCO | SSD ResNet50 V1 FPN 640x640 | mAP | 34.3 | #1 |
