no code implementations • 4 Mar 2019 • Zhi-Gang Liu, Matthew Mattina
The Straight-Through Estimator (STE) is widely used for back-propagating gradients through the quantization function, but a complete theoretical understanding of the technique is still lacking.
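A minimal NumPy sketch of the STE idea, not the paper's method: the forward pass quantizes (a non-differentiable step), while the backward pass passes the upstream gradient straight through, zeroing it only where the forward pass clipped. Function names are illustrative.

```python
import numpy as np

def quantize_ste_forward(w, num_bits=8):
    """Uniform symmetric quantization of weights w (forward pass)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.round(w / scale)                    # non-differentiable step
    return np.clip(q, -qmax, qmax) * scale, scale, qmax

def quantize_ste_backward(grad_out, w, scale, qmax):
    """STE backward: identity gradient, masked where the value clipped."""
    inside = np.abs(w / scale) <= qmax
    return grad_out * inside

# toy usage
w = np.random.randn(4, 4).astype(np.float32)
w_q, scale, qmax = quantize_ste_forward(w)
grad_w = quantize_ste_backward(np.ones_like(w), w, scale, qmax)
```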
no code implementations • 24 Jan 2020 • Urmish Thakker, Paul N. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse Beu
Kronecker Products (KP) have been used to compress RNNs in IoT applications by factors of 15-38x, achieving better results than traditional compression methods.
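A small NumPy sketch of the core idea, with illustrative shapes rather than the paper's exact configuration: a large weight matrix is replaced by the Kronecker product of two much smaller factors, so only the factors are stored, and the matrix-vector product can be computed without ever materializing the full matrix.

```python
import numpy as np

# Replace a 256x256 weight matrix (65,536 parameters) with the
# Kronecker product of two 16x16 factors (512 parameters): 128x smaller.
A = np.random.randn(16, 16).astype(np.float32)
B = np.random.randn(16, 16).astype(np.float32)

x = np.random.randn(256).astype(np.float32)
W = np.kron(A, B)                  # full matrix, built only for checking
y = W @ x

# The same product without materializing W, via the identity
# (A kron B) x = vec(A @ X @ B.T), where X is a row-major reshape of x.
X = x.reshape(16, 16)
y_fast = (A @ X @ B.T).reshape(-1)

assert np.allclose(y, y_fast, rtol=1e-4, atol=1e-4)
print("compression factor:", W.size / (A.size + B.size))   # 128.0
```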
no code implementations • 16 May 2020 • Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina
Convolutional neural network (CNN) inference on mobile devices demands efficient hardware acceleration of low-precision (INT8) general matrix multiplication (GEMM).
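A hedged sketch of the kind of low-precision GEMM this refers to (symmetric per-tensor quantization is an assumption here, not necessarily the paper's scheme): INT8 operands multiplied with INT32 accumulation, then rescaled back to float.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization to int8; returns values and scale."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

A = np.random.randn(64, 128).astype(np.float32)   # activations
W = np.random.randn(128, 32).astype(np.float32)   # weights

qa, sa = quantize_int8(A)
qw, sw = quantize_int8(W)

# Integer GEMM: int8 x int8 products accumulated in int32,
# then rescaled to float with the product of the two scales.
acc = qa.astype(np.int32) @ qw.astype(np.int32)
C = acc.astype(np.float32) * (sa * sw)

print(np.max(np.abs(C - A @ W)))   # small quantization error
```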
no code implementations • ECCV 2020 • Zhi-Gang Liu, Matthew Mattina
Prior research has shown that the Winograd algorithm can reduce the computational complexity of convolutional neural networks (CNNs) with weights and activations represented in floating point.
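For context, a minimal NumPy sketch of the standard 1D Winograd F(2,3) transform (the matrices are the well-known ones from Lavin and Gray, 2016, not anything specific to this paper): two convolution outputs are produced with 4 elementwise multiplies instead of the 6 a direct computation needs.

```python
import numpy as np

# Standard F(2,3) transform matrices (Lavin & Gray, 2016).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_f23(d, g):
    """Two outputs of the 1D convolution of a 4-tap input tile d
    with a 3-tap filter g, using 4 elementwise multiplies."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
g = np.array([1.0, -1.0, 2.0], dtype=np.float32)
direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                   d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])
assert np.allclose(winograd_f23(d, g), direct)
```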
no code implementations • 4 Sep 2020 • Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina
In this paper, we address a key architectural challenge with structural sparsity: how to provide support for a range of sparsity levels while maintaining high utilization of the hardware.
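One way to picture a range of structured-sparsity levels, sketched here in NumPy under assumptions of mine rather than as the paper's design: each contiguous block of M weights keeps at most N largest-magnitude entries, and varying N selects the sparsity level the hardware must support.

```python
import numpy as np

def prune_dbb(w, block=8, nnz=2):
    """Prune each contiguous block of `block` weights along the last
    axis so at most the `nnz` largest-magnitude entries survive.
    Varying `nnz` sets the sparsity level (e.g. 1/8, 2/8, 4/8)."""
    rows, cols = w.shape
    assert cols % block == 0
    blocks = w.reshape(rows, cols // block, block)
    # indices of the (block - nnz) smallest-magnitude entries per block
    drop = np.argsort(np.abs(blocks), axis=-1)[..., : block - nnz]
    mask = np.ones_like(blocks, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=-1)
    return (blocks * mask).reshape(rows, cols)

w = np.random.randn(4, 16).astype(np.float32)
for nnz in (1, 2, 4):                       # 87.5%, 75%, 50% sparsity
    pruned = prune_dbb(w, block=8, nnz=nnz)
    print(nnz, 1.0 - np.count_nonzero(pruned) / pruned.size)
```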
no code implementations • 16 Jul 2021 • Zhi-Gang Liu, Paul N. Whatmough, Yuhao Zhu, Matthew Mattina
We propose to exploit structured sparsity, specifically Density Bound Block (DBB) sparsity, for both weights and activations.
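Weights can be pruned to a DBB pattern offline, but activations are only known at run time. A hedged NumPy sketch of one plausible way to enforce a density bound on activations dynamically (an illustration of the concept, not the paper's hardware mechanism): within each block, keep only the top-N magnitudes and zero the rest.

```python
import numpy as np

def dbb_activations(x, block=8, nnz=4):
    """Dynamically enforce a density bound on activations: within each
    block of `block` values, keep the `nnz` of largest magnitude."""
    flat = x.reshape(-1, block).copy()
    drop = np.argsort(np.abs(flat), axis=-1)[:, : block - nnz]
    np.put_along_axis(flat, drop, 0.0, axis=-1)
    return flat.reshape(x.shape)

a = np.maximum(np.random.randn(2, 16), 0).astype(np.float32)  # post-ReLU
a_dbb = dbb_activations(a, block=8, nnz=4)
print(np.count_nonzero(a_dbb.reshape(-1, 8), axis=-1))  # <= 4 per block
```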