Search Results for author: Ankur Gandhe

Found 15 papers, 1 paper with code

USTED: Improving ASR with a Unified Speech and Text Encoder-Decoder

no code implementations • 12 Feb 2022 • Bolaji Yusuf, Ankur Gandhe, Alex Sokolov

There has been a recent focus on training E2E ASR models that get the performance benefits of external text data without incurring the extra cost of evaluating an external language model at inference time.

Machine Translation, Speech Recognition

A Likelihood Ratio based Domain Adaptation Method for E2E Models

no code implementations • 10 Jan 2022 • Chhavi Choudhury, Ankur Gandhe, Xiaohan Ding, Ivan Bulyko

In this work, we explore a contextual biasing approach using a likelihood ratio that leverages text data sources to adapt an RNN-T model to new domains and entities.

Automatic Speech Recognition, Domain Adaptation

Domain Prompts: Towards memory and compute efficient domain adaptation of ASR systems

no code implementations • 16 Dec 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications in very diverse domains, creating a need to adapt to new domains with small memory and deployment overhead.

Automatic Speech Recognition, Domain Adaptation

Lattention: Lattice-attention in ASR rescoring

no code implementations • 19 Nov 2021 • Prabhat Pandey, Sergio Duarte Torres, Ali Orkan Bayer, Ankur Gandhe, Volker Leutnant

The rescoring model with attention to lattices achieves 4-5% relative word error rate reduction over first-pass and 6-8% with attention to both lattices and acoustic features.

Automatic Speech Recognition, Spoken Language Understanding

Prompt-tuning in ASR systems for efficient domain-adaptation

no code implementations • 13 Oct 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain.

Automatic Speech Recognition, Domain Adaptation

Attention-based Contextual Language Model Adaptation for Speech Recognition

1 code implementation • Findings (ACL) 2021 • Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe

When applied to a large de-identified dataset of utterances collected by a popular voice assistant platform, our method reduces perplexity by 7.0% relative over a standard LM that does not incorporate contextual information.

Automatic Speech Recognition, voice assistant

Personalization Strategies for End-to-End Speech Recognition Systems

no code implementations • 15 Feb 2021 • Aditya Gourav, Linda Liu, Ankur Gandhe, Yile Gu, Guitang Lan, Xiangyang Huang, Shashank Kalmane, Gautam Tiwari, Denis Filimonov, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko

We also describe a novel second-pass de-biasing approach: used in conjunction with a first-pass shallow fusion that optimizes on oracle WER, we can achieve an additional 14% improvement on personalized content recognition, and even improve accuracy for the general use case by up to 2.5%.

Speech Recognition

Domain-aware Neural Language Models for Speech Recognition

no code implementations • 5 Jan 2021 • Linda Liu, Yile Gu, Aditya Gourav, Ankur Gandhe, Shashank Kalmane, Denis Filimonov, Ariya Rastrow, Ivan Bulyko

As voice assistants become more ubiquitous, they are increasingly expected to support and perform well on a wide variety of use-cases across different domains.

Domain Adaptation, Speech Recognition

Improving accuracy of rare words for RNN-Transducer through unigram shallow fusion

no code implementations • 30 Nov 2020 • Vijay Ravi, Yile Gu, Ankur Gandhe, Ariya Rastrow, Linda Liu, Denis Filimonov, Scott Novotney, Ivan Bulyko

We show that this simple method can improve performance on rare words by 3.7% WER relative without degradation on general test set, and the improvement from USF is additive to any additional language model based rescoring.

Automatic Speech Recognition

Multi-task Language Modeling for Improving Speech Recognition of Rare Words

no code implementations • 23 Nov 2020 • Chao-Han Huck Yang, Linda Liu, Ankur Gandhe, Yile Gu, Anirudh Raju, Denis Filimonov, Ivan Bulyko

We show that our rescoring model trained with these additional tasks outperforms the baseline rescoring model, trained with only the language modeling task, by 1.4% on a general test and by 2.6% on a rare word test set in terms of word-error-rate relative (WERR).

Automatic Speech Recognition, Multi-Task Learning

Audio-attention discriminative language model for ASR rescoring

no code implementations • 6 Dec 2019 • Ankur Gandhe, Ariya Rastrow

In this work, we propose to combine the benefits of end-to-end approaches with a conventional system using an attention-based discriminative language model that learns to rescore the output of a first-pass ASR system.

Automatic Speech Recognition

Scalable language model adaptation for spoken dialogue systems

no code implementations • 11 Dec 2018 • Ankur Gandhe, Ariya Rastrow, Bjorn Hoffmeister

New application intents and interaction types are released for these systems over time, imposing challenges to adapt the LMs since the existing training data is no longer sufficient to model the future user interactions.

Speech Recognition, Spoken Dialogue Systems

Contextual Language Model Adaptation for Conversational Agents

no code implementations • 26 Jun 2018 • Anirudh Raju, Behnam Hedayatnia, Linda Liu, Ankur Gandhe, Chandra Khatri, Angeliki Metallinou, Anu Venkatesh, Ariya Rastrow

Statistical language models (LM) play a key role in Automatic Speech Recognition (ASR) systems used by conversational agents.

Automatic Speech Recognition

Just ASK: Building an Architecture for Extensible Self-Service Spoken Language Understanding

no code implementations • 1 Nov 2017 • Anjishnu Kumar, Arpit Gupta, Julian Chan, Sam Tucker, Bjorn Hoffmeister, Markus Dreyer, Stanislav Peshterliev, Ankur Gandhe, Denis Filiminov, Ariya Rastrow, Christian Monson, Agnika Kumar

This paper presents the design of the machine learning architecture that underlies the Alexa Skills Kit (ASK), a large-scale Spoken Language Understanding (SLU) Software Development Kit (SDK) that enables developers to extend the capabilities of Amazon's virtual assistant, Alexa.

Spoken Language Understanding
