Learnable Lookup Table for Neural Network Quantization

Neural network quantization aims to reduce the bit-widths of weights and activations for memory and computational efficiency. Since a linear quantizer (i.e., the round(·) function) cannot fit the bell-shaped distributions of weights and activations well, many existing methods use pre-defined functions (e.g., an exponential function) with learnable parameters to build the quantizer for joint optimization. However, these complicated quantizers introduce considerable computational overhead during inference, since activation quantization must be conducted online. In this paper, we formulate the quantization process as a simple lookup operation and propose to learn lookup tables as quantizers. Specifically, we develop differentiable lookup tables and introduce several training strategies for their optimization. Our lookup tables can be trained with the network in an end-to-end manner to fit the distributions in different layers, and they incur very little additional computational cost. Comparisons with previous methods show that quantized networks using our lookup tables achieve state-of-the-art performance on image classification, image super-resolution, and point cloud classification tasks.
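To make the lookup-table idea concrete, below is a minimal PyTorch sketch of a learnable lookup-table quantizer. It is an illustration under stated assumptions, not the paper's implementation: the class name `LookupTableQuantizer`, the `table_size` parameter, the learnable clipping `scale`, and the use of a straight-through estimator for gradient flow are all assumptions made for this example.

```python
# Illustrative sketch only: a learnable lookup table used as a quantizer.
# Names, initialization, and gradient handling are assumptions, not the
# authors' exact method.
import torch
import torch.nn as nn


class LookupTableQuantizer(nn.Module):
    """Maps full-precision values to low-bit values via a learnable table."""

    def __init__(self, bits: int = 2, table_size: int = 256):
        super().__init__()
        self.levels = 2 ** bits
        self.table_size = table_size
        # Learnable table entries, initialized as a uniform (linear) quantizer.
        self.table = nn.Parameter(torch.linspace(0.0, 1.0, table_size))
        # Learnable clipping scale so the table covers the input range.
        self.scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize inputs to [0, 1] with the learnable scale.
        x_norm = torch.clamp(x / self.scale, 0.0, 1.0)
        # Convert inputs to integer table indices (the "lookup" step).
        idx = (x_norm * (self.table_size - 1)).round().long()
        # Snap table entries to the allowed low-bit levels with a
        # straight-through estimator, so the table stays trainable.
        t = self.table.clamp(0.0, 1.0)
        t_q = t + ((t * (self.levels - 1)).round() / (self.levels - 1) - t).detach()
        # Gather quantized outputs; gradients reach the selected table entries.
        out = t_q[idx]
        # Straight-through estimator so gradients also flow back to the input.
        out = out + (x_norm - x_norm.detach())
        return self.scale * out


# Usage: quantize an activation tensor to 2-bit values via a single lookup.
quantizer = LookupTableQuantizer(bits=2)
activations = torch.randn(4, 8)
q_activations = quantizer(activations)
```

At inference time such a quantizer reduces to a single indexing operation per value, which is consistent with the abstract's claim that the lookup tables add very little computational cost compared to quantizers built from pre-defined functions that must be evaluated online.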
