Search Results for author: Rohit Paturi

Found 5 papers, 1 paper with code

Generalized zero-shot audio-to-intent classification

no code implementations • 4 Nov 2023 • Veera Raghavendra Elluru, Devang Kulshreshtha, Rohit Paturi, Sravan Bodapati, Srikanth Ronanki

Our multimodal training approach improves the accuracy of zero-shot intent classification on unseen intents of SLURP by 2.75% and 18.2% for the SLURP and internal goal-oriented dialog datasets, respectively, compared to audio-only training.

Classification, Goal-Oriented Dialog, +5

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

1 code implementation • 1 Nov 2023 • Juan Zuluaga-Gomez, Zhaocheng Huang, Xing Niu, Rohit Paturi, Sundararajan Srinivasan, Prashant Mathur, Brian Thompson, Marcello Federico

Conventional speech-to-text translation (ST) systems are trained on single-speaker utterances, and they may not generalize to real-life scenarios where the audio contains conversations by multiple speakers.

Automatic Speech Recognition speech-recognition +3

Speaker Diarization of Scripted Audiovisual Content

no code implementations • 4 Aug 2023 • Yogesh Virkar, Brian Thompson, Rohit Paturi, Sundararajan Srinivasan, Marcello Federico

The media localization industry usually requires a verbatim script of the final film or TV production in order to create subtitles or dubbing scripts in a foreign language.

speaker-diarization Speaker Diarization +2
