# Explore the Potential of CNN Low Bit Training

Convolutional Neural Networks (CNNs) have been widely used in many fields. However, training them costs substantial energy and time, most of which is consumed by convolution operations. In this paper, we propose a low-bit training framework that reduces the data bit-width of all convolution inputs in training: activations, weights, and errors. The framework consists of dynamic multi-level quantization and low-bit tensor convolution arithmetic: the former uses multi-level scaling factors to extract the common parts of a tensor as far as possible, reducing the bit-width of each element, while the latter performs convolution on low-bit tensors in this special format and naturally accumulates the outputs at higher precision. With this dynamic-precision framework, we reduce the bit-width of convolution, the most computationally expensive part of training, while keeping the training process close to full-precision floating-point training. Without changing the network structure or the full-precision training procedure, we experimentally explore how far the training bit-width can be reduced through scaling and quantization while maintaining accuracy. The results show that when training ResNet-20 on the CIFAR-10 dataset, the convolution inputs can be reduced to a 1-bit mantissa and a 2-bit exponent with no loss of accuracy. On ImageNet, with a 3-bit mantissa and a 3-bit exponent, the accuracy loss is less than $1\%$. This scheme uses fewer bits than previous floating-point approaches and achieves higher accuracy than previous fixed-point ones. Furthermore, evaluation of our hardware prototype shows that the framework has the potential to achieve at least $2\times$ the computational energy efficiency of full-precision training.
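To make the idea concrete, the following is a minimal sketch of quantizing a tensor to a low-bit float-like representation: a shared per-tensor scaling factor (the "common part" extracted from the data) plus a per-element sign, small mantissa, and small exponent. This is an illustration of the general low-bit mantissa/exponent idea, not the paper's exact algorithm; the function name, bit-width defaults, and rounding choices here are assumptions.

```python
import numpy as np

def low_bit_quantize(x, m_bits=3, e_bits=3):
    """Illustrative low-bit quantization: a shared per-tensor scale plus
    per-element sign, m_bits-bit mantissa and e_bits-bit exponent.
    (Hypothetical sketch, not the paper's exact scheme.)"""
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return np.zeros_like(x)
    scale = max_abs          # common scaling factor extracted from the tensor
    y = x / scale            # now |y| <= 1
    sign = np.sign(y)
    mag = np.abs(y)
    # Exponent: power-of-two bucket, clipped to the e_bits range (here all <= 0
    # because of the shared scale).
    e_min = -(2 ** e_bits - 1)
    exp = np.clip(np.floor(np.log2(np.maximum(mag, 2.0 ** e_min))), e_min, 0)
    # Mantissa: normalized to [1, 2), rounded to m_bits fractional bits.
    mant = np.round(mag / 2.0 ** exp * 2 ** m_bits) / 2 ** m_bits
    q = sign * mant * 2.0 ** exp * scale
    q[mag < 2.0 ** e_min] = 0.0   # values below the smallest exponent underflow
    return q
```

Values representable exactly in the format (e.g. the largest-magnitude element, which maps to mantissa 1, exponent 0) pass through unchanged, while other values pick up a small relative rounding error governed by the mantissa width.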
