no code implementations • 13 Jun 2024 • Jari Kolehmainen, Aditya Gourav, Prashanth Gurunath Shivakumar, Yile Gu, Ankur Gandhe, Ariya Rastrow, Grant Strimel, Ivan Bulyko
Retrieval is a widely adopted approach for improving language models by leveraging external information.
no code implementations • 19 Jan 2024 • Yu Yu, Chao-Han Huck Yang, Tuan Dinh, Sungho Ryu, Jari Kolehmainen, Roger Ren, Denis Filimonov, Prashanth G. Shivakumar, Ankur Gandhe, Ariya Rastrow, Jia Xu, Ivan Bulyko, Andreas Stolcke
The use of low-rank adaptation (LoRA) with frozen pretrained language models (PLMs) has become increasingly popular as a mainstream, resource-efficient modeling approach for memory-constrained hardware.
no code implementations • 5 Jan 2024 • Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke
In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text.
no code implementations • 23 Dec 2023 • Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko
Specifically, our framework serializes tasks in the order of current paralinguistic attribute prediction, response paralinguistic attribute prediction, and response text generation with autoregressive conditioning.
no code implementations • 10 Oct 2023 • Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
In this study, we propose and explore several discriminative fine-tuning schemes for pre-trained LMs.
Automatic Speech Recognition (ASR)
no code implementations • 26 Sep 2023 • Yu Yu, Chao-Han Huck Yang, Jari Kolehmainen, Prashanth G. Shivakumar, Yile Gu, Sungho Ryu, Roger Ren, Qi Luo, Aditya Gourav, I-Fan Chen, Yi-Chieh Liu, Tuan Dinh, Ankur Gandhe, Denis Filimonov, Shalini Ghosh, Andreas Stolcke, Ariya Rastrow, Ivan Bulyko
We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring.
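As a rough illustration of the low-rank adaptation idea mentioned in this entry (not the authors' implementation), the sketch below adds a trainable low-rank update to a frozen weight matrix. The names `lora_forward`, `down`, and `up`, the toy dimensions, and the `alpha / rank` scaling convention are assumptions chosen for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w_frozen, down, up, alpha):
    """Return x @ w_frozen + (alpha / rank) * (x @ down) @ up.

    w_frozen (d_in x d_out) stays fixed; only down (d_in x rank) and
    up (rank x d_out) would be trained in a LoRA-style setup.
    """
    rank = len(up)
    scale = alpha / rank
    base = matmul(x, w_frozen)
    update = matmul(matmul(x, down), up)
    return [[b + scale * u for b, u in zip(b_row, u_row)]
            for b_row, u_row in zip(base, update)]


# Toy sizes (hypothetical): an 8 x 8 layer adapted with rank 2.
rng = random.Random(0)
d_in, d_out, rank = 8, 8, 2
w_frozen = [[rng.gauss(0.0, 1.0) for _ in range(d_out)] for _ in range(d_in)]
down = [[rng.gauss(0.0, 0.01) for _ in range(rank)] for _ in range(d_in)]
up = [[0.0] * d_out for _ in range(rank)]  # zero init: the adapter starts as a no-op

x = [[float(i) for i in range(1, d_in + 1)]]
y = lora_forward(x, w_frozen, down, up, alpha=8.0)

full_params = d_in * d_out
trainable_params = rank * (d_in + d_out)
```

With `up` initialised to zeros, the adapted layer reproduces the frozen layer exactly before any training, and the number of trainable values grows with `rank * (d_in + d_out)` rather than `d_in * d_out`.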
no code implementations • 13 Jul 2023 • Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
On a test set with personalized named entities, we show that each of these approaches improves word error rate by over 10%, against a neural rescoring baseline.
no code implementations • 27 Jun 2023 • Yile Gu, Prashanth Gurunath Shivakumar, Jari Kolehmainen, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
We study whether this scaling property is also applicable to second-pass rescoring, which is an important component of speech recognition systems.
no code implementations • 15 Jun 2023 • Prashanth Gurunath Shivakumar, Jari Kolehmainen, Yile Gu, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko
We also show that the proposed distillation can reduce the WER gap between the student and the teacher by 62% to 100%.
no code implementations • 2 Jun 2023 • Denis Filimonov, Prabhat Pandey, Ariya Rastrow, Ankur Gandhe, Andreas Stolcke
In interactive automatic speech recognition (ASR) systems, low-latency requirements limit the amount of search space that can be explored during decoding, particularly in end-to-end neural ASR.
Automatic Speech Recognition (ASR)
no code implementations • 9 May 2023 • Xuandi Fu, Kanthashree Mysore Sathyendra, Ankur Gandhe, Jing Liu, Grant P. Strimel, Ross McGowan, Athanasios Mouchtaris
Prior approaches typically relied on subword encoders for encoding the bias phrases.
no code implementations • 30 Mar 2023 • Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant Strimel, Andreas Stolcke, Ivan Bulyko
End-to-End (E2E) automatic speech recognition (ASR) systems used in voice assistants often have difficulties recognizing infrequent words personalized to the user, such as names and places.
Automatic Speech Recognition (ASR)
no code implementations • 20 Mar 2023 • Bolaji Yusuf, Aditya Gourav, Ankur Gandhe, Ivan Bulyko
End-to-end speech recognition models are improved by incorporating external text sources, typically by fusion with an external language model.
no code implementations • 12 Feb 2022 • Bolaji Yusuf, Ankur Gandhe, Alex Sokolov
There has been a recent focus on training E2E ASR models that get the performance benefits of external text data without incurring the extra cost of evaluating an external language model at inference time.
no code implementations • 2 Feb 2022 • Liyan Xu, Yile Gu, Jari Kolehmainen, Haidar Khan, Ankur Gandhe, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko
Specifically, training a bidirectional model like BERT on a discriminative objective such as minimum WER (MWER) has not been explored.
Automatic Speech Recognition (ASR)
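For readers unfamiliar with the minimum WER (MWER) objective named in this entry, here is a minimal sketch of one common formulation over an n-best list: the expected number of word errors under the model's renormalised hypothesis scores, with the mean error subtracted as a baseline. This is an assumption about the general technique, not the exact loss used in the paper; the function names and the example utterance are hypothetical.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mwer_loss(scores, hyps, ref):
    """Expected word errors over the n-best list, relative to the mean error.

    scores are log-domain model scores (higher is better); they are
    renormalised with a softmax restricted to the n-best hypotheses.
    """
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    errors = [edit_distance(ref, h) for h in hyps]
    mean_error = sum(errors) / len(errors)
    return sum(p * (e - mean_error) for p, e in zip(probs, errors))


# Hypothetical n-best list for one utterance.
ref = "play songs by adele".split()
hyps = [h.split() for h in ("play songs by adele",
                            "play songs by a dell",
                            "play song by adele")]
errors = [edit_distance(ref, h) for h in hyps]

loss_good = mwer_loss([2.0, 0.0, 0.0], hyps, ref)  # model prefers the correct hypothesis
loss_bad = mwer_loss([0.0, 2.0, 0.0], hyps, ref)   # model prefers the worst hypothesis
```

The loss is lower when probability mass sits on hypotheses with fewer word errors, which is what makes it a discriminative training signal for a rescoring model.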
no code implementations • 10 Jan 2022 • Chhavi Choudhury, Ankur Gandhe, Xiaohan Ding, Ivan Bulyko
In this work, we explore a contextual biasing approach using a likelihood ratio that leverages text data sources to adapt an RNN-T model to new domains and entities.
Automatic Speech Recognition (ASR)
no code implementations • 16 Dec 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications in very diverse domains, creating a need to adapt to new domains with small memory and deployment overhead.
Automatic Speech Recognition (ASR)
no code implementations • 19 Nov 2021 • Prabhat Pandey, Sergio Duarte Torres, Ali Orkan Bayer, Ankur Gandhe, Volker Leutnant
The rescoring model with attention to lattices achieves 4-5% relative word error rate reduction over first-pass and 6-8% with attention to both lattices and acoustic features.
Automatic Speech Recognition (ASR)
no code implementations • 13 Oct 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain.
Automatic Speech Recognition (ASR)
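The prompt-tuning idea described in this entry can be sketched as follows, assuming the usual setup in which a few trainable embedding vectors per domain are prepended to the frozen token embeddings before they enter the transformer. The domain names, sizes, and the helper `prompted_input` are hypothetical and only illustrate the data flow, not the authors' system.

```python
import random


def make_embedding(n_rows, dim, rng):
    """Create an n_rows x dim table of small random vectors."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(n_rows)]


rng = random.Random(0)
dim, vocab_size, n_prompt_tokens = 8, 100, 4

# Frozen: the pretrained LM's token embedding table (toy stand-in).
token_table = make_embedding(vocab_size, dim, rng)

# Trainable: one small block of prompt embeddings per domain.
domain_prompts = {name: make_embedding(n_prompt_tokens, dim, rng)
                  for name in ("music", "shopping")}


def prompted_input(domain, token_ids):
    """Prepend the domain's prompt embeddings to the token embeddings."""
    tokens = [token_table[t] for t in token_ids]
    return domain_prompts[domain] + tokens


seq = prompted_input("music", [5, 17, 42])
trainable_per_domain = n_prompt_tokens * dim
frozen_embedding_params = vocab_size * dim
```

Under these assumptions, adapting to a new domain stores only `n_prompt_tokens * dim` values, while the language model weights are shared across all domains.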
1 code implementation • Findings (ACL) 2021 • Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe
When applied to a large de-identified dataset of utterances collected by a popular voice assistant platform, our method reduces perplexity by 7.0% relative over a standard LM that does not incorporate contextual information.
Automatic Speech Recognition (ASR)
no code implementations • 15 Feb 2021 • Aditya Gourav, Linda Liu, Ankur Gandhe, Yile Gu, Guitang Lan, Xiangyang Huang, Shashank Kalmane, Gautam Tiwari, Denis Filimonov, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko
We also describe a novel second-pass de-biasing approach: used in conjunction with a first-pass shallow fusion that optimizes on oracle WER, we can achieve an additional 14% improvement on personalized content recognition, and even improve accuracy for the general use case by up to 2.5%.
no code implementations • 5 Jan 2021 • Linda Liu, Yile Gu, Aditya Gourav, Ankur Gandhe, Shashank Kalmane, Denis Filimonov, Ariya Rastrow, Ivan Bulyko
As voice assistants become more ubiquitous, they are increasingly expected to support and perform well on a wide variety of use-cases across different domains.
no code implementations • 30 Nov 2020 • Vijay Ravi, Yile Gu, Ankur Gandhe, Ariya Rastrow, Linda Liu, Denis Filimonov, Scott Novotney, Ivan Bulyko
We show that this simple method can improve performance on rare words by 3.7% WER relative without degradation on a general test set, and the improvement from USF is additive to any additional language model based rescoring.
Automatic Speech Recognition (ASR)
no code implementations • 23 Nov 2020 • Chao-Han Huck Yang, Linda Liu, Ankur Gandhe, Yile Gu, Anirudh Raju, Denis Filimonov, Ivan Bulyko
We show that our rescoring model trained with these additional tasks outperforms the baseline rescoring model, trained with only the language modeling task, by 1.4% on a general test and by 2.6% on a rare word test set in terms of word-error-rate relative (WERR).
Automatic Speech Recognition (ASR)
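Several entries in this list report relative word error rate reduction (WERR). As a small worked example, and assuming the usual definition (baseline WER minus new WER, divided by baseline WER), the arithmetic looks like this. The WER values below are invented for illustration and are not taken from any of the papers.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers: a baseline at 10.00% WER and a new model at 9.74% WER.
werr = relative_wer_reduction(10.0, 9.74)
print(f"WERR: {werr:.1f}%")
```

Note that a relative figure on its own does not reveal the absolute WER: the same 0.26-point absolute drop would be a much larger relative reduction on a baseline with lower WER.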
no code implementations • 6 Dec 2019 • Ankur Gandhe, Ariya Rastrow
In this work, we propose to combine the benefits of end-to-end approaches with a conventional system using an attention-based discriminative language model that learns to rescore the output of a first-pass ASR system.
Automatic Speech Recognition (ASR)
no code implementations • 11 Dec 2018 • Ankur Gandhe, Ariya Rastrow, Bjorn Hoffmeister
New application intents and interaction types are released for these systems over time, imposing challenges to adapt the LMs since the existing training data is no longer sufficient to model the future user interactions.
no code implementations • 26 Jun 2018 • Anirudh Raju, Behnam Hedayatnia, Linda Liu, Ankur Gandhe, Chandra Khatri, Angeliki Metallinou, Anu Venkatesh, Ariya Rastrow
Statistical language models (LM) play a key role in Automatic Speech Recognition (ASR) systems used by conversational agents.
Automatic Speech Recognition (ASR)
no code implementations • 1 Nov 2017 • Anjishnu Kumar, Arpit Gupta, Julian Chan, Sam Tucker, Bjorn Hoffmeister, Markus Dreyer, Stanislav Peshterliev, Ankur Gandhe, Denis Filimonov, Ariya Rastrow, Christian Monson, Agnika Kumar
This paper presents the design of the machine learning architecture that underlies the Alexa Skills Kit (ASK), a large-scale Spoken Language Understanding (SLU) Software Development Kit (SDK) that enables developers to extend the capabilities of Amazon's virtual assistant, Alexa.