TernGEMM: GEneral Matrix Multiply Library with Ternary Weights for Fast DNN Inference

Efficient implementation of deep neural networks on CPU-based systems is critical as applications proliferate to embedded and Internet of Things (IoT) systems. Many CPUs for personal computers and embedded systems provide Single Instruction Multiple Data (SIMD) instructions, which can be used to build the efficient GEneral Matrix Multiply (GEMM) libraries that deep neural network inference depends on. While many deep neural networks achieve good accuracy even at 1-bit or 2-bit precision, current CPU instructions and libraries do not efficiently support arithmetic below 8 bits. We propose TernGEMM, a GEMM library built on SIMD instructions for Deep Neural Network (DNN) inference with ternary weights and sub-8-bit activations. TernGEMM improves speed by replacing slow multiply-add operations with logical operations and by accumulating many products before any bit-expansion operation is required. We compared the speedup of TernGEMM against a tiling-optimized baseline and GEMMLowp, an 8-bit precision GEMM library. On an Intel CPU, TernGEMM achieves speedups of ×2.052, ×2.973, and ×2.986 on ResNet-50, MobileNet-V2, and EfficientNet-B0, respectively. On an ARM CPU, TernGEMM's speedups are ×2.143, ×1.765, and ×1.856, respectively.
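To illustrate the "logical operations instead of multiply-add" idea described in the abstract, here is a minimal scalar C++ sketch, not the paper's SIMD kernel. It assumes ternary weights are encoded as positive/negative bit-planes and low-bit unsigned activations are bit-sliced into planes, so a 64-element dot product reduces to AND and popcount; the names ternary_dot64, w_pos, w_neg, and act_planes are illustrative, not from the paper.

```cpp
// Minimal scalar sketch of a ternary dot product via logical operations.
// Assumptions: weights in {-1, 0, +1} encoded as two bit-planes (w_pos, w_neg);
// unsigned activations stored as `bits` bit-planes (bit-slicing).
// One 64-bit word packs 64 elements of a tile.
#include <bit>       // std::popcount (C++20)
#include <cstdint>
#include <cstdio>

// Dot product of 64 ternary weights with 64 `bits`-bit unsigned activations,
// using only AND + popcount instead of per-element multiply-add.
int64_t ternary_dot64(uint64_t w_pos, uint64_t w_neg,
                      const uint64_t* act_planes, int bits) {
    int64_t acc = 0;
    for (int k = 0; k < bits; ++k) {
        // Count lanes contributing +1 or -1 at bit position k,
        // scale the difference by 2^k, and accumulate in a wide register,
        // so no per-element bit expansion is needed.
        int64_t partial = std::popcount(w_pos & act_planes[k]) -
                          std::popcount(w_neg & act_planes[k]);
        acc += partial << k;
    }
    return acc;
}

int main() {
    // Toy example: all 64 activations equal 3 (binary 11 -> both planes set);
    // weights: lanes 0..9 = +1, lanes 10..14 = -1, remaining lanes = 0.
    uint64_t w_pos = (1ULL << 10) - 1;
    uint64_t w_neg = ((1ULL << 15) - 1) & ~w_pos;
    uint64_t planes[2] = {~0ULL, ~0ULL};
    // Expected result: (10 - 5) * 3 = 15
    std::printf("%lld\n", (long long)ternary_dot64(w_pos, w_neg, planes, 2));
    return 0;
}
```

In an actual SIMD kernel, the AND and popcount steps would operate on vector registers rather than single 64-bit words, but the arithmetic reduction shown here is the same.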
