Search Results for author: Grant P. Strimel

Found 14 papers, 0 papers with code

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach

no code implementations17 Oct 2022 Kai Zhen, Martin Radfar, Hieu Duy Nguyen, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

For on-device automatic speech recognition (ASR), quantization-aware training (QAT) is widely used to balance model predictive performance against efficiency.

Automatic Speech Recognition, Quantization +1
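The core mechanism behind QAT is "fake quantization": weights are rounded to a low-bit grid and dequantized in the forward pass so training sees the quantization error. A minimal sketch of a symmetric sub-8-bit quantizer (per-tensor scaling; the function name and bit-width choice are illustrative, not the paper's exact scheme):

```python
import numpy as np

def fake_quantize(w, num_bits=4):
    """Quantize-dequantize weights so training observes quantization error.
    Symmetric uniform quantizer -- a toy illustration of the fake-quant
    step used in quantization-aware training (QAT)."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4-bit signed
    scale = np.max(np.abs(w)) / qmax        # per-tensor scale factor
    if scale == 0:
        return w
    q = np.clip(np.round(w / scale), -qmax, qmax)  # snap to integer grid
    return q * scale                        # back to float ("dequantize")

weights = np.array([0.9, -0.35, 0.02, 0.7])
wq = fake_quantize(weights, num_bits=4)
```

In real QAT the rounding step is paired with a straight-through gradient estimator; the paper's contribution is achieving sub-8-bit precision without extra regularizers on top of this basic recipe.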

ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

no code implementations29 Sep 2022 Martin Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

Recently, as an alternative to LSTM layers, the Conformer architecture was introduced, in which the RNN-T encoder is replaced with a modified Transformer encoder that places convolutional layers at the frontend and between attention layers.

Speech Recognition
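The ordering described above, local convolution before global self-attention, can be sketched in a few lines. This is a hypothetical toy (a depthwise 1-D smoothing filter followed by single-head attention without learned projections), not the Conformer or ConvRNN-T architecture itself:

```python
import numpy as np

def conv1d_frontend(x, kernel):
    """Depthwise 1-D convolution over time with 'same' padding: a toy
    stand-in for a convolutional frontend capturing local context."""
    T, D = x.shape
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for t in range(T):
        out[t] = sum(kernel[i] * xp[t + i] for i in range(k))
    return out

def self_attention(x):
    """Single-head scaled dot-product self-attention (no projections),
    illustrating the global-context stage after the conv frontend."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    a = np.exp(scores)
    a /= a.sum(axis=1, keepdims=True)             # rows sum to 1
    return a @ x

x = np.random.default_rng(0).normal(size=(6, 4)) # 6 frames, 4 dims
y = self_attention(conv1d_frontend(x, kernel=[0.25, 0.5, 0.25]))
```

The design point is complementary receptive fields: convolution models nearby frames cheaply, attention then mixes information across the whole utterance.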

Compute Cost Amortized Transformer for Streaming ASR

no code implementations5 Jul 2022 Yi Xie, Jonathan Macoskey, Martin Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel

We present a streaming, Transformer-based end-to-end automatic speech recognition (ASR) architecture which achieves efficient neural inference through compute cost amortization.

Automatic Speech Recognition, Speech Recognition
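Compute cost amortization means spending the full network only on the frames that need it. A minimal sketch, assuming a scalar gate score per frame and two branches of different cost (the gate, threshold, and branch functions are all hypothetical placeholders):

```python
import numpy as np

def amortized_forward(frames, gate, cheap_fn, full_fn, threshold=0.5):
    """Route each frame through a cheap or a full branch based on a
    gate score -- a toy sketch of compute-cost amortization. Returns
    outputs plus the fraction of frames that took the full path."""
    outputs, full_count = [], 0
    for x in frames:
        if gate(x) > threshold:        # 'important' frame: pay full cost
            outputs.append(full_fn(x))
            full_count += 1
        else:                          # cheap path for the rest
            outputs.append(cheap_fn(x))
    return np.array(outputs), full_count / len(frames)

frames = np.linspace(0.0, 1.0, 8)       # stand-in acoustic frames
gate = lambda x: x                      # hypothetical learned gate score
outs, full_frac = amortized_forward(
    frames, gate,
    cheap_fn=lambda x: 0.5 * x,         # low-cost approximation
    full_fn=lambda x: x ** 2,           # expensive branch
)
```

Average inference cost then scales with `full_frac` rather than with the sequence length alone, which is what makes the approach attractive for streaming ASR.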

Latency Control for Keyword Spotting

no code implementations15 Jun 2022 Christin Jose, Joseph Wang, Grant P. Strimel, Mohammad Omar Khursheed, Yuriy Mishchenko, Brian Kulis

We also show that when our approach is used in conjunction with a max-pooling loss, we reduce false accepts by 25% relative at a fixed latency, compared to a cross-entropy loss.

Keyword Spotting
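The max-pooling loss mentioned above scores a clip by its single best frame rather than every frame. A toy per-clip version, assuming binary keyword-vs-background logits (the exact formulation in the paper may differ):

```python
import numpy as np

def max_pooling_loss(frame_logits, is_keyword):
    """Toy max-pooling loss for keyword spotting: for a positive clip,
    apply cross-entropy only at the frame with the highest keyword
    posterior; for a negative clip, penalize the most confident false
    frame. frame_logits: shape (T,) keyword-vs-background logits."""
    probs = 1.0 / (1.0 + np.exp(-frame_logits))   # per-frame posteriors
    peak = np.max(probs)                          # best-scoring frame
    if is_keyword:
        return -np.log(peak + 1e-12)              # pull the peak up
    return -np.log(1.0 - peak + 1e-12)            # push the peak down

pos = max_pooling_loss(np.array([-2.0, 3.0, -1.0]), True)
neg = max_pooling_loss(np.array([-2.0, 3.0, -1.0]), False)
```

Because only the peak frame is supervised, the model is free to fire late or early within the clip, which is exactly the degree of freedom that latency-control methods then constrain.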

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

no code implementations1 Apr 2022 Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra

In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system.

Automatic Speech Recognition, Natural Language Understanding +2

Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization

no code implementations3 Aug 2021 Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference time latency on speech recognition tasks.

Inference Optimization, Keyword Spotting +3

Amortized Neural Networks for Low-Latency Speech Recognition

no code implementations3 Aug 2021 Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

We apply AmNets to the Recurrent Neural Network Transducer (RNN-T) to reduce compute cost and latency for an automatic speech recognition (ASR) task.

Automatic Speech Recognition, Speech Recognition

Learning a Neural Diff for Speech Models

no code implementations3 Aug 2021 Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

As more speech processing applications execute locally on edge devices, a set of resource constraints must be considered.

Automatic Speech Recognition, Model Compression +2
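The "neural diff" idea is to ship a compact delta between a deployed base model and an updated one instead of the full model. A minimal sketch using simple top-k magnitude selection (the keep-fraction heuristic is illustrative; the paper learns the diff rather than thresholding it):

```python
import numpy as np

def make_diff(base, updated, keep_frac=0.25):
    """Sparse 'diff' between two weight tensors: keep only the largest
    keep_frac of the changes (indices + values), a toy stand-in for
    transmitting a compact model update instead of the full model."""
    delta = updated - base
    k = max(1, int(keep_frac * delta.size))
    idx = np.argsort(np.abs(delta).ravel())[-k:]   # largest changes
    return idx, delta.ravel()[idx]

def apply_diff(base, idx, values):
    """Patch the on-device base model with the received sparse diff."""
    patched = base.ravel().copy()
    patched[idx] += values
    return patched.reshape(base.shape)

base = np.zeros(8)
updated = np.array([0.0, 0.9, 0.0, -0.8, 0.05, 0.0, 0.01, 0.0])
idx, vals = make_diff(base, updated, keep_frac=0.25)
patched = apply_diff(base, idx, vals)
```

The payload is then `k` index-value pairs instead of the full parameter set, which matters under the bandwidth and storage constraints of edge devices.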

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

no code implementations14 Jun 2021 Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

We find that tandem training of teacher and student encoders with an in-place encoder distillation outperforms the use of a pre-trained, static teacher transducer.

Knowledge Distillation, Speech Recognition +1
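The tandem (co-learning) setup combines a task loss with a term pulling the student's encoder representations toward the teacher's, while both models keep training. A toy objective, assuming an MSE representation term and a mixing weight `alpha` (both illustrative, not the paper's exact formulation):

```python
import numpy as np

def distillation_loss(student_repr, teacher_repr, task_loss, alpha=0.5):
    """Toy co-learning distillation objective: the student matches the
    teacher's encoder representations (MSE term) while still training
    on the task loss. alpha weights the two terms."""
    repr_loss = np.mean((student_repr - teacher_repr) ** 2)
    return task_loss + alpha * repr_loss

s = np.array([0.1, 0.4, -0.2])   # student encoder output (stand-in)
t = np.array([0.0, 0.5, -0.1])   # teacher encoder output (stand-in)
loss = distillation_loss(s, t, task_loss=1.0, alpha=0.5)
```

Because the teacher is trained in tandem rather than frozen, its representations stay reachable for the smaller student, which is the intuition behind the reported gain over a static pre-trained teacher.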

Semantic Complexity in End-to-End Spoken Language Understanding

no code implementations6 Aug 2020 Joseph P. McKenna, Samridhi Choudhary, Michael Saxon, Grant P. Strimel, Athanasios Mouchtaris

We perform experiments where we vary the semantic complexity of a large, proprietary dataset and show that STI model performance correlates with our semantic complexity measures, such that performance increases as complexity values decrease.

Spoken Language Understanding
