Search Results for author: Grant P. Strimel

Found 20 papers, 0 papers with code

Search Optimization with Query Likelihood Boosting and Two-Level Approximate Search for Edge Devices

no code implementations • 12 Dec 2023 • Jianwei Zhang, Helian Feng, Xin He, Grant P. Strimel, Farhad Ghassemi, Ali Kebarighotbi

We present a novel search optimization solution for approximate nearest neighbor (ANN) search on resource-constrained edge devices.

Retrieval

Accelerator-Aware Training for Transducer-Based Speech Recognition

no code implementations • 12 May 2023 • Suhaila M. Shakiah, Rupak Vignesh Swaminathan, Hieu Duy Nguyen, Raviteja Chinta, Tariq Afzal, Nathan Susanj, Athanasios Mouchtaris, Grant P. Strimel, Ariya Rastrow

Machine learning model weights and activations are represented in full precision during training.

Quantization speech-recognition +1

Robust Acoustic and Semantic Contextual Biasing in Neural Transducers for Speech Recognition

no code implementations • 9 May 2023 • Xuandi Fu, Kanthashree Mysore Sathyendra, Ankur Gandhe, Jing Liu, Grant P. Strimel, Ross McGowan, Athanasios Mouchtaris

Prior approaches typically relied on subword encoders for encoding the bias phrases.

Automatic Speech Recognition Language Modelling +2

Lookahead When It Matters: Adaptive Non-causal Transformers for Streaming Neural Transducers

no code implementations • 7 May 2023 • Grant P. Strimel, Yi Xie, Brian King, Martin Radfar, Ariya Rastrow, Athanasios Mouchtaris

Streaming speech recognition architectures are employed for low-latency, real-time applications.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

no code implementations • 3 Apr 2023 • Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree M. Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann

We present dual-attention neural biasing, an architecture designed to boost Wake Words (WW) recognition and improve inference time latency on speech recognition tasks.

speech-recognition Speech Recognition

Dialog act guided contextual adapter for personalized speech recognition

no code implementations • 31 Mar 2023 • Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant P. Strimel, Ross McGowan

Specifically, it leverages dialog acts to select the most relevant user catalogs and creates queries based on both the audio and the semantic relationship between the carrier phrase and user catalogs to better guide the contextual biasing.

Automatic Speech Recognition speech-recognition +1

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach

no code implementations • 17 Oct 2022 • Kai Zhen, Martin Radfar, Hieu Duy Nguyen, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

For on-device automatic speech recognition (ASR), quantization-aware training (QAT) is ubiquitous for balancing the trade-off between model predictive performance and efficiency.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

no code implementations • 29 Sep 2022 • Martin Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

Very recently, as an alternative to LSTM layers, the Conformer architecture was introduced, in which the encoder of RNN-T is replaced with a modified Transformer encoder composed of convolutional layers at the frontend and between attention layers.

speech-recognition Speech Recognition

Compute Cost Amortized Transformer for Streaming ASR

no code implementations • 5 Jul 2022 • Yi Xie, Jonathan Macoskey, Martin Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel

We present a streaming, Transformer-based end-to-end automatic speech recognition (ASR) architecture which achieves efficient neural inference through compute cost amortization.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Latency Control for Keyword Spotting

no code implementations • 15 Jun 2022 • Christin Jose, Joseph Wang, Grant P. Strimel, Mohammad Omar Khursheed, Yuriy Mishchenko, Brian Kulis

We also show that when our approach is used in conjunction with a max-pooling loss, we are able to improve relative false accepts by 25% at a fixed latency when compared to cross-entropy loss.

Keyword Spotting

Contextual Adapters for Personalized Speech Recognition in Neural Transducers

no code implementations • 26 May 2022 • Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Feng-Ju Chang, Jing Liu, Jinru Su, Grant P. Strimel, Athanasios Mouchtaris, Siegfried Kunzmann

Personal rare word recognition in end-to-end Automatic Speech Recognition (E2E ASR) models is a challenge due to the lack of training data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

A neural prosody encoder for end-to-end dialogue act classification

no code implementations • 11 May 2022 • Kai Wei, Dillon Knox, Martin Radfar, Thanh Tran, Markus Muller, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris, Maurizio Omologo

Dialogue act classification (DAC) is a critical task for spoken language understanding in dialogue systems.

Dialogue Act Classification Spoken Language Understanding

Multi-task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding

no code implementations • 1 Apr 2022 • Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra

In addition, the NLU model in the two-stage system is not streamable, as it must wait for the audio segments to complete processing, which ultimately impacts the latency of the SLU system.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Attentive Contextual Carryover for Multi-Turn End-to-End Spoken Language Understanding

no code implementations • 13 Dec 2021 • Kai Wei, Thanh Tran, Feng-Ju Chang, Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Jing Liu, Anirudh Raju, Ross McGowan, Nathan Susanj, Ariya Rastrow, Grant P. Strimel

Recent years have seen significant advances in end-to-end (E2E) spoken language understanding (SLU) systems, which directly predict intents and slots from spoken audio.

Natural Language Understanding Spoken Language Understanding

Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization

no code implementations • 3 Aug 2021 • Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference time latency on speech recognition tasks.

Inference Optimization Keyword Spotting +3

Learning a Neural Diff for Speech Models

no code implementations • 3 Aug 2021 • Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

As more speech processing applications execute locally on edge devices, a set of resource constraints must be considered.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Amortized Neural Networks for Low-Latency Speech Recognition

no code implementations • 3 Aug 2021 • Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

We apply AmNets to the Recurrent Neural Network Transducer (RNN-T) to reduce compute cost and latency for an automatic speech recognition (ASR) task.

Ranked #55 on Speech Recognition on LibriSpeech test-clean

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

no code implementations • 14 Jun 2021 • Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

We find that tandem training of teacher and student encoders with an inplace encoder distillation outperforms the use of a pre-trained and static teacher transducer.

Knowledge Distillation speech-recognition +1

Semantic Complexity in End-to-End Spoken Language Understanding

no code implementations • 6 Aug 2020 • Joseph P. McKenna, Samridhi Choudhary, Michael Saxon, Grant P. Strimel, Athanasios Mouchtaris

We perform experiments where we vary the semantic complexity of a large, proprietary dataset and show that STI model performance correlates with our semantic complexity measures, such that performance increases as complexity values decrease.

Spoken Language Understanding

Statistical Model Compression for Small-Footprint Natural Language Understanding

no code implementations • 19 Jul 2018 • Grant P. Strimel, Kanthashree Mysore Sathyendra, Stanislav Peshterliev

In this paper we investigate statistical model compression applied to natural language understanding (NLU) models.

Model Compression Natural Language Understanding +1
