Search Results for author: Sravan Bodapati

Found 26 papers, 1 paper with code

Multi-teacher Distillation for Multilingual Spelling Correction

no code implementations • 20 Nov 2023 • Jingfen Zhang, Xuan Guo, Sravan Bodapati, Christopher Potts

Accurate spelling correction is a critical step in modern search interfaces, especially in an era of mobile devices and speech-to-text interfaces.

Multilingual NLP Spelling Correction

Generalized zero-shot audio-to-intent classification

no code implementations • 4 Nov 2023 • Veera Raghavendra Elluru, Devang Kulshreshtha, Rohit Paturi, Sravan Bodapati, Srikanth Ronanki

Our multimodal training approach improves the accuracy of zero-shot intent classification on unseen intents of SLURP by 2.75% and 18.2% for the SLURP and internal goal-oriented dialog datasets, respectively, compared to audio-only training.

Classification Goal-Oriented Dialog +5

Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages

no code implementations • 3 Jul 2023 • Devang Kulshreshtha, Saket Dingliwal, Brady Houston, Sravan Bodapati

A recent approach explores Contextual Adapters, wherein an attention-based biasing model for CTC is used to improve the recognition of custom entities.

Automatic Speech Recognition (ASR) +1

Masked Audio Text Encoders are Effective Multi-Modal Rescorers

no code implementations • 11 May 2023 • Jinglun Cai, Monica Sunkara, Xilai Li, Anshu Bhatia, Xiao Pan, Sravan Bodapati

Masked Language Models (MLMs) have proven to be effective for second-pass rescoring in Automatic Speech Recognition (ASR) systems.

Automatic Speech Recognition (ASR) +4

Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR

no code implementations • 18 Apr 2023 • Xilai Li, Goeric Huybrechts, Srikanth Ronanki, Jeff Farris, Sravan Bodapati

Overall, our proposed model reduces the degradation of the streaming mode over the non-streaming full-contextual model from 41.7% and 45.7% to 16.7% and 26.2% on the LibriSpeech test-clean and test-other datasets, respectively, while improving by a relative 15.5% WER over the previous state-of-the-art unified model.

Speech Recognition

Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale

1 code implementation • 18 Dec 2022 • Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth

Using a 66 billion parameter language model (OPT-66B) across a diverse set of 14 downstream tasks, we find this is indeed the case: ~70% of attention heads and ~20% of feed forward networks can be removed with minimal decline in task performance.

In-Context Learning Language Modelling +1

Device Directedness with Contextual Cues for Spoken Dialog Systems

no code implementations • 23 Nov 2022 • Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff

In this work, we define barge-in verification as a supervised learning task where audio-only information is used to classify user spoken dialogue into true and false barge-ins.

Automatic Speech Recognition (ASR) +2

Towards Personalization of CTC Speech Recognition Models with Contextual Adapters and Adaptive Boosting

no code implementations • 18 Oct 2022 • Saket Dingliwal, Monica Sunkara, Sravan Bodapati, Srikanth Ronanki, Jeff Farris, Katrin Kirchhoff

End-to-end speech recognition models trained using joint Connectionist Temporal Classification (CTC)-Attention loss have gained popularity recently.

Speech Recognition

Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems

no code implementations • 16 Dec 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Automatic Speech Recognition (ASR) systems have found their use in numerous industrial applications in very diverse domains, creating a need to adapt to new domains with small memory and deployment overhead.

Automatic Speech Recognition (ASR) +3

Prompt-tuning in ASR systems for efficient domain-adaptation

no code implementations • 13 Oct 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain.

Automatic Speech Recognition (ASR) +2

Remember the context! ASR slot error correction through memorization

no code implementations • 10 Sep 2021 • Dhanush Bekal, Ashish Shenoy, Monica Sunkara, Sravan Bodapati, Katrin Kirchhoff

Accurate recognition of slot values, such as domain-specific words or named entities, by automatic speech recognition (ASR) systems forms the core of goal-oriented dialogue systems.

Automatic Speech Recognition (ASR) +3

ASR Adaptation for E-commerce Chatbots using Cross-Utterance Context and Multi-Task Language Modeling

no code implementations • ACL (ECNLP) 2021 • Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff

In this paper, we investigate various techniques to improve contextualization, content word robustness and domain adaptation of a Transformer-XL neural language model (NLM) to rescore ASR N-best hypotheses.

Automatic Speech Recognition (ASR) +3

Adapting Long Context NLM for ASR Rescoring in Conversational Agents

no code implementations • 21 Apr 2021 • Ashish Shenoy, Sravan Bodapati, Monica Sunkara, Srikanth Ronanki, Katrin Kirchhoff

Neural Language Models (NLM), when trained and evaluated with context spanning multiple utterances, have been shown to consistently outperform both conventional n-gram language models and NLMs that use limited context.

Intent Classification +2

Contextual Biasing of Language Models for Speech Recognition in Goal-Oriented Conversational Agents

no code implementations • 18 Mar 2021 • Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff

In this paper, we explore different ways to incorporate context into an LSTM-based NLM in order to model long-range dependencies and improve speech recognition.

Automatic Speech Recognition (ASR) +4

Neural Inverse Text Normalization

no code implementations • 12 Feb 2021 • Monica Sunkara, Chaitanya Shivade, Sravan Bodapati, Katrin Kirchhoff

We propose an efficient and robust neural solution for ITN, leveraging transformer-based seq2seq models and FST-based text normalization techniques for data preparation.

Towards Semi-Supervised Semantics Understanding from Speech

no code implementations • 11 Nov 2020 • Cheng-I Lai, Jin Cao, Sravan Bodapati, Shang-Wen Li

Much recent work on Spoken Language Understanding (SLU) falls short in at least one of three ways: models were trained on oracle text input and neglected the Automatic Speech Recognition (ASR) outputs, models were trained to predict only intents without the slot values, or models were trained on a large amount of in-house data.

Speech Recognition +1

Multimodal Semi-supervised Learning Framework for Punctuation Prediction in Conversational Speech

no code implementations • 3 Aug 2020 • Monica Sunkara, Srikanth Ronanki, Dhanush Bekal, Sravan Bodapati, Katrin Kirchhoff

Experiments conducted on the Fisher corpus show that our proposed approach achieves ~6-9% and ~3-4% absolute improvement (F1 score) over the baseline BLSTM model on reference transcripts and ASR outputs, respectively.

Data Augmentation

Zero-Shot Reinforcement Learning with Deep Attention Convolutional Neural Networks

no code implementations • 2 Jan 2020 • Sahika Genc, Sunil Mallya, Sravan Bodapati, Tao Sun, Yunzhe Tao

Simulation-to-simulation and simulation-to-real world transfer of neural network models has been a difficult problem.

Autonomous Driving Deep Attention +4

Robustness to Capitalization Errors in Named Entity Recognition

no code implementations • WS 2019 • Sravan Bodapati, Hyokun Yun, Yaser Al-Onaizan

Robustness to capitalization errors is a highly desirable characteristic of named entity recognizers, yet we find standard models for the task are surprisingly brittle to such noise.

Data Augmentation Named Entity Recognition +2
