no code implementations • 28 May 2024 • Aparna Elangovan, Ling Liu, Lei Xu, Sravan Bodapati, Dan Roth
In this position paper, we argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking that draws upon insights from disciplines such as user experience research and human behavioral psychology to ensure that the experimental design and results are reliable.
no code implementations • 20 Nov 2023 • Jingfen Zhang, Xuan Guo, Sravan Bodapati, Christopher Potts
Accurate spelling correction is a critical step in modern search interfaces, especially in an era of mobile devices and speech-to-text interfaces.
no code implementations • 14 Nov 2023 • Sai Muralidhar Jayanthi, Devang Kulshreshtha, Saket Dingliwal, Srikanth Ronanki, Sravan Bodapati
Personalization of automatic speech recognition (ASR) models is a widely studied topic because of its many practical applications.
Automatic Speech Recognition (ASR)
no code implementations • 4 Nov 2023 • Veera Raghavendra Elluru, Devang Kulshreshtha, Rohit Paturi, Sravan Bodapati, Srikanth Ronanki
Our multimodal training approach improves the accuracy of zero-shot intent classification on unseen intents of SLURP by 2.75% and 18.2% for the SLURP and internal goal-oriented dialog datasets, respectively, compared to audio-only training.
no code implementations • 3 Jul 2023 • Devang Kulshreshtha, Saket Dingliwal, Brady Houston, Sravan Bodapati
A recent approach explores Contextual Adapters, wherein an attention-based biasing model for CTC is used to improve the recognition of custom entities.
Automatic Speech Recognition (ASR)
no code implementations • 2 Jul 2023 • Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff
Speech representations learned in a self-supervised fashion from massive unlabeled speech corpora have been adapted successfully toward several downstream tasks.
Automatic Speech Recognition (ASR)
no code implementations • 13 Jun 2023 • Goeric Huybrechts, Srikanth Ronanki, Xilai Li, Hadis Nosrati, Sravan Bodapati, Katrin Kirchhoff
To address this issue, we propose the integration of a novel dynamic contextual carry-over mechanism in a state-of-the-art (SOTA) unified ASR system.
Automatic Speech Recognition (ASR)
no code implementations • 11 May 2023 • Jinglun Cai, Monica Sunkara, Xilai Li, Anshu Bhatia, Xiao Pan, Sravan Bodapati
Masked Language Models (MLMs) have proven to be effective for second-pass rescoring in Automatic Speech Recognition (ASR) systems.
Automatic Speech Recognition (ASR)
no code implementations • 5 May 2023 • Nilaksh Das, Monica Sunkara, Sravan Bodapati, Jinglun Cai, Devang Kulshreshtha, Jeff Farris, Katrin Kirchhoff
Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models such as attention-based encoder-decoder and RNN-T.
no code implementations • 18 Apr 2023 • Xilai Li, Goeric Huybrechts, Srikanth Ronanki, Jeff Farris, Sravan Bodapati
Overall, our proposed model reduces the degradation of the streaming mode over the non-streaming full-contextual model from 41.7% and 45.7% to 16.7% and 26.2% on the LibriSpeech test-clean and test-other datasets, respectively, while improving by a relative 15.5% WER over the previous state-of-the-art unified model.
1 code implementation • 18 Dec 2022 • Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth
Using a 66 billion parameter language model (OPT-66B) across a diverse set of 14 downstream tasks, we find this is indeed the case: $\sim$70% of attention heads and $\sim$20% of feed forward networks can be removed with minimal decline in task performance.
no code implementations • 23 Nov 2022 • Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff
In this work, we define barge-in verification as a supervised learning task where audio-only information is used to classify user spoken dialogue into true and false barge-ins.
Automatic Speech Recognition (ASR)
no code implementations • 18 Oct 2022 • Saket Dingliwal, Monica Sunkara, Sravan Bodapati, Srikanth Ronanki, Jeff Farris, Katrin Kirchhoff
End-to-end speech recognition models trained using joint Connectionist Temporal Classification (CTC)-Attention loss have gained popularity recently.
no code implementations • 16 Dec 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
Automatic Speech Recognition (ASR) systems are used in numerous industrial applications across very diverse domains, creating a need to adapt to new domains with small memory and deployment overhead.
Automatic Speech Recognition (ASR)
no code implementations • 13 Oct 2021 • Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain.
Automatic Speech Recognition (ASR)
no code implementations • 10 Sep 2021 • Dhanush Bekal, Ashish Shenoy, Monica Sunkara, Sravan Bodapati, Katrin Kirchhoff
Accurate recognition of slot values, such as domain-specific words or named entities, by automatic speech recognition (ASR) systems forms the core of goal-oriented dialogue systems.
Automatic Speech Recognition (ASR)
no code implementations • ACL (ECNLP) 2021 • Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff
In this paper, we investigate various techniques to improve contextualization, content word robustness and domain adaptation of a Transformer-XL neural language model (NLM) to rescore ASR N-best hypotheses.
Automatic Speech Recognition (ASR)
no code implementations • 21 Apr 2021 • Ashish Shenoy, Sravan Bodapati, Monica Sunkara, Srikanth Ronanki, Katrin Kirchhoff
Neural Language Models (NLMs), when trained and evaluated with context spanning multiple utterances, have been shown to consistently outperform both conventional n-gram language models and NLMs that use limited context.
no code implementations • 18 Mar 2021 • Ashish Shenoy, Sravan Bodapati, Katrin Kirchhoff
In this paper, we explore different ways to incorporate context into an LSTM-based NLM in order to model long-range dependencies and improve speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 10 Mar 2021 • Nilaksh Das, Sravan Bodapati, Monica Sunkara, Sundararajan Srinivasan, Duen Horng Chau
Training deep neural networks for automatic speech recognition (ASR) requires large amounts of transcribed speech.
no code implementations • 12 Feb 2021 • Monica Sunkara, Chaitanya Shivade, Sravan Bodapati, Katrin Kirchhoff
We propose an efficient and robust neural solution for ITN, leveraging transformer-based seq2seq models and FST-based text normalization techniques for data preparation.
no code implementations • 11 Nov 2020 • Cheng-I Lai, Jin Cao, Sravan Bodapati, Shang-Wen Li
Much recent work on Spoken Language Understanding (SLU) falls short in at least one of three ways: models were trained on oracle text input and neglected Automatic Speech Recognition (ASR) outputs, models were trained to predict only intents without the slot values, or models were trained on a large amount of in-house data.
no code implementations • 3 Aug 2020 • Monica Sunkara, Srikanth Ronanki, Dhanush Bekal, Sravan Bodapati, Katrin Kirchhoff
Experiments conducted on the Fisher corpus show that our proposed approach achieves ~6-9% and ~3-4% absolute improvement (F1 score) over the baseline BLSTM model on reference transcripts and ASR outputs, respectively.
no code implementations • WS 2020 • Monica Sunkara, Srikanth Ronanki, Kalpit Dixit, Sravan Bodapati, Katrin Kirchhoff
We also present techniques for domain- and task-specific adaptation by fine-tuning masked language models with medical domain data.
Automatic Speech Recognition (ASR)
no code implementations • 2 Jan 2020 • Sahika Genc, Sunil Mallya, Sravan Bodapati, Tao Sun, Yunzhe Tao
Simulation-to-simulation and simulation-to-real world transfer of neural network models has been a difficult problem.
no code implementations • 13 Nov 2019 • Sunil Mallya, Marc Overhage, Sravan Bodapati, Navneet Srivastava, Sahika Genc
Chronic disease progression is emerging as an important area of investment for healthcare providers.
no code implementations • WS 2019 • Sravan Bodapati, Hyokun Yun, Yaser Al-Onaizan
Robustness to capitalization errors is a highly desirable characteristic of named entity recognizers, yet we find standard models for the task are surprisingly brittle to such noise.