no code implementations • 4 Mar 2023 • Sashank Macha, Om Oza, Alex Escott, Francesco Caliva, Robbie Armitano, Santosh Kumar Cheekatmalla, Sree Hari Krishnan Parthasarathi, Yuzong Liu
Furthermore, on an in-house KWS dataset, we show that our 8-bit fixed-point quantization-aware-trained (FXP-QAT) models achieve a 4-6% relative improvement in false discovery rate at a fixed false reject rate compared to full-precision floating-point (FLP) models.
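To make the training scheme concrete, below is a minimal sketch of fake quantization with a straight-through estimator, a common way to implement fixed-point QAT; the symmetric per-tensor scaling and function name are assumptions, not the paper's exact recipe.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Simulate symmetric fixed-point quantization in the forward pass
    while keeping gradients full precision (straight-through estimator).
    A generic QAT sketch; the paper's exact FXP scheme is not shown."""
    qmax = 2 ** (num_bits - 1) - 1                # e.g. 127 for 8-bit
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # Forward pass uses the quantized values; backward pass lets
    # gradients flow through unchanged.
    return x + (q - x).detach()
```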
no code implementations • 13 Jul 2022 • Lu Zeng, Sree Hari Krishnan Parthasarathi, Yuzong Liu, Alex Escott, Santosh Kumar Cheekatmalla, Nikko Strom, Shiv Vitaladevuni
We organize our results in two embedded chipset settings: a) with the commodity ARM NEON instruction set and 8-bit containers, we present accuracy, CPU, and memory results using sub-8-bit weights (4, 5, and 8-bit) and 8-bit quantization of the rest of the network; b) with off-the-shelf neural network accelerators, we present accuracy results and project the reduction in memory utilization for a range of weight bit widths (1 and 5-bit).
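As an illustration of the 8-bit-container setting in (a), the sketch below packs two 4-bit weight codes per byte, halving weight memory; the packing layout and helper names are hypothetical, and the paper's NEON kernels are not reproduced here.

```python
import numpy as np

def pack_4bit(codes: np.ndarray) -> np.ndarray:
    """Pack quantized 4-bit weight codes (values 0..15) two-per-byte
    into 8-bit containers. Illustrative layout only."""
    flat = codes.astype(np.uint8).ravel()
    if flat.size % 2:                             # pad to an even count
        flat = np.append(flat, np.uint8(0))
    return (flat[0::2] << 4) | flat[1::2]         # high nibble | low nibble

def unpack_4bit(packed: np.ndarray) -> np.ndarray:
    """Recover the 4-bit codes from their 8-bit containers."""
    return np.stack([(packed >> 4) & 0xF, packed & 0xF], axis=1).ravel()
```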
no code implementations • 29 Sep 2021 • Mohammad Omar Khursheed, Christin Jose, Rajath Kumar, GengShen Fu, Brian Kulis, Santosh Kumar Cheekatmalla
In this work, we propose Tiny-CRNN (Tiny Convolutional Recurrent Neural Network) models applied to the problem of wakeword detection, and augment them with scaled dot-product attention.
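Scaled dot-product attention itself is the standard mechanism, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; a minimal sketch follows, with the tensor shapes and the absence of masking being assumptions rather than the paper's configuration.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.
    q, k: (..., seq_len, d_k); v: (..., seq_len, d_v)."""
    d_k = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    return torch.matmul(F.softmax(scores, dim=-1), v)
```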
no code implementations • 25 Nov 2020 • Mohammad Omar Khursheed, Christin Jose, Rajath Kumar, GengShen Fu, Brian Kulis, Santosh Kumar Cheekatmalla
In this work, we propose small-footprint Convolutional Recurrent Neural Network (CRNN) models applied to the problem of wakeword detection, and augment them with scaled dot-product attention.
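For orientation, a small-footprint CRNN of this kind could combine a convolutional front-end, a recurrent layer, and attention pooling over time; the sketch below is a hypothetical instance, and every layer size and the learned-query pooling scheme are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SmallFootprintCRNN(nn.Module):
    """Hypothetical wakeword CRNN: conv front-end -> GRU ->
    attention pooling -> detection score."""
    def __init__(self, n_mels=64, conv_ch=32, rnn_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, conv_ch, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.rnn = nn.GRU(conv_ch * (n_mels // 2), rnn_dim, batch_first=True)
        self.query = nn.Parameter(torch.randn(rnn_dim))  # learned query
        self.out = nn.Linear(rnn_dim, 1)

    def forward(self, x):                  # x: (batch, 1, time, n_mels)
        h = self.conv(x)                   # (batch, ch, time/2, n_mels/2)
        b, c, t, f = h.shape
        h = h.permute(0, 2, 1, 3).reshape(b, t, c * f)
        h, _ = self.rnn(h)                 # (batch, time/2, rnn_dim)
        # Attention pooling: weight each time step by its dot product
        # with the learned query, then take the weighted sum.
        w = torch.softmax(h @ self.query, dim=1)       # (batch, time/2)
        pooled = (w.unsqueeze(-1) * h).sum(dim=1)      # (batch, rnn_dim)
        return torch.sigmoid(self.out(pooled)).squeeze(-1)
```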