Search Results for author: Preethi Jyothi

Found 48 papers, 16 papers with code

Synthesizing Audio for Hindi WordNet

no code implementations • GWC 2018 • Diptesh Kanojia, Preethi Jyothi, Pushpak Bhattacharyya

We (1) develop voices using existing implementations of the aforementioned systems, and (2) use these voices to generate sample audios for randomly chosen words; we manually evaluate the generated audio and produce audio for all WordNet words using the winning voice model.

Speech Synthesis

Zero-shot Disfluency Detection for Indian Languages

no code implementations • COLING 2022 • Rohit Kundu, Preethi Jyothi, Pushpak Bhattacharyya

We present a detailed pipeline to synthetically generate disfluent text and create evaluation datasets for four Indian languages: Bengali, Hindi, Malayalam, and Marathi.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1

Improving RNN-Transducers with Acoustic LookAhead

no code implementations • 11 Jul 2023 • Vinit S. Unni, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi

RNN-Transducers (RNN-Ts) have gained widespread acceptance as an end-to-end model for speech to text conversion because of their high accuracy and streaming capabilities.

Towards Zero-Shot Code-Switched Speech Recognition

no code implementations • 2 Nov 2022 • Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe

In this work, we seek to build effective code-switched (CS) automatic speech recognition systems (ASR) under the zero-shot setting where no transcribed CS speech data is available for training.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +3

Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

no code implementations • 30 Oct 2022 • Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi, Ganesh Ramakrishnan

Training state-of-the-art ASR systems such as RNN-T often has a high associated financial and environmental cost.

DICTDIS: Dictionary Constrained Disambiguation for Improved NMT

no code implementations • 13 Oct 2022 • Ayush Maheshwari, Piyush Sharma, Preethi Jyothi, Ganesh Ramakrishnan

In this work, we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries.

Machine Translation • NMT

Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding

no code implementations • ACL 2022 • Soumya Chatterjee, Sunita Sarawagi, Preethi Jyothi

Online alignment in machine translation refers to the task of aligning a target word to a source word when the target sequence has only been partially decoded.

Machine Translation • Translation

Investigating Modality Bias in Audio Visual Video Parsing

no code implementations • 31 Mar 2022 • Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan

We focus on the audio-visual video parsing (AVVP) problem that involves detecting audio and visual event labels with temporal boundaries.

Error Correction in ASR using Sequence-to-Sequence Models

no code implementations • 2 Feb 2022 • Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi

Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +3

DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation

no code implementations • 10 Oct 2021 • Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi

To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn) that uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1
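The budgeted acquisition this abstract describes rests on greedy maximization of a submodular objective. A minimal, generic sketch of that loop (not the authors' DITTO implementation; the facility-location-style objective, similarity matrix, and budget below are illustrative assumptions):

```python
def facility_location_gain(sim, selected, candidate):
    """Marginal gain of adding `candidate`: how much it raises each
    target row's best similarity to the selected set."""
    gain = 0.0
    for row in sim:  # one row per target-accent exemplar
        current = max((row[j] for j in selected), default=0.0)
        gain += max(row[candidate], current) - current
    return gain

def greedy_select(sim, budget):
    """Standard greedy algorithm for monotone submodular maximization:
    repeatedly pick the pool item with the largest marginal gain,
    stopping when the budget is exhausted."""
    selected, pool = [], list(range(len(sim[0])))
    for _ in range(budget):
        best = max(pool, key=lambda c: facility_location_gain(sim, selected, c))
        selected.append(best)
        pool.remove(best)
    return selected
```

The greedy rule enjoys the classic (1 - 1/e) approximation guarantee for monotone submodular functions, which is what makes budgeted selection of this kind tractable.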

The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding

no code implementations • EMNLP (MRL) 2021 • Archiki Prasad, Mohammad Ali Rehan, Shreya Pathak, Preethi Jyothi

In this work, we propose the use of bilingual intermediate pretraining as a reliable technique to derive large and consistent performance gains on three different NLP tasks using code-switched text.

Language Modelling • Natural Language Inference +4

From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text

1 code implementation • ACL 2021 • Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi

Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large volumes of real code-switched text.

Data Augmentation • Language Modelling +3

Cross-Modal learning for Audio-Visual Video Parsing

1 code implementation • 3 Apr 2021 • Jatin Lamba, abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan

In this paper, we present a novel approach to the audio-visual video parsing (AVVP) task that demarcates events from a video separately for audio and visual modalities.

Event Detection • Multiple Instance Learning +1

Collaborative Learning to Generate Audio-Video Jointly

no code implementations • 1 Apr 2021 • Vinod K Kurmi, Vipul Bajaj, Badri N Patro, K S Venkatesh, Vinay P Namboodiri, Preethi Jyothi

Towards this, we propose a method that generates naturalistic video and audio samples through the joint, correlated generation of the audio and video modalities.

Rudder: A Cross Lingual Video and Text Retrieval Dataset

1 code implementation • 9 Mar 2021 • Jayaprakash A, abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi

Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input.

Natural Language Queries • Retrieval +2

Error-driven Fixed-Budget ASR Personalization for Accented Speakers

1 code implementation • 4 Mar 2021 • Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances.

Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages

no code implementations • 19 Oct 2020 • Anuj Diwan, Preethi Jyothi

This work presents a seemingly simple but effective technique to improve low-resource ASR systems for phonetic languages.

Speech Recognition

Improving Low Resource Code-switched ASR using Augmented Code-switched TTS

no code implementations • 12 Oct 2020 • Yash Sharma, Basil Abraham, Karan Taneja, Preethi Jyothi

Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2

How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems

no code implementations • ACL 2020 • Archiki Prasad, Preethi Jyothi

We use a state-of-the-art end-to-end ASR system, comprising convolutional and recurrent layers, that is trained on a large amount of US-accented English speech and evaluate the model on speech samples from seven different English accents.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +1

Black-box Adaptation of ASR for Accented Speech

1 code implementation • 24 Jun 2020 • Kartik Khandelwal, Preethi Jyothi, Abhijeet Awasthi, Sunita Sarawagi

Accordingly, we propose a novel coupling of an open-source accent-tuned local model with the black-box service where the output from the service guides frame-level inference in the local model.

Coupled Training of Sequence-to-Sequence Models for Accented Speech Recognition

1 code implementation • 14 May 2020 • Vinit Unni, Nitish Joshi, Preethi Jyothi

We propose coupled training for encoder-decoder ASR models that acts on pairs of utterances corresponding to the same text spoken by speakers with different accents.

Accented Speech Recognition • Automatic Speech Recognition +2

Stem-driven Language Models for Morphologically Rich Languages

no code implementations • 25 Oct 2019 • Yash Shah, Ishan Tarunesh, Harsh Deshpande, Preethi Jyothi

Neural language models (LMs) have shown to benefit significantly from enhancing word vectors with subword-level information, especially for morphologically rich languages.

Multi-Task Learning

End-to-End ASR for Code-switched Hindi-English Speech

no code implementations • 22 Jun 2019 • Brij Mohan Lal Srivastava, Basil Abraham, Sunayana Sitaram, Rupesh Mehta, Preethi Jyothi

While the lack of data adversely affects the performance of end-to-end models, we see promising improvements with MTL and balancing the corpus.

Multi-Task Learning

Revisiting the Importance of Encoding Logic Rules in Sentiment Classification

1 code implementation • EMNLP 2018 • Kalpesh Krishna, Preethi Jyothi, Mohit Iyyer

We analyze the performance of different sentiment classification models on syntactically complex inputs like A-but-B sentences.

Classification • General Classification +2
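The A-but-B pattern analyzed in this line of work can be made concrete with a toy example: one of the logic rules under discussion says the sentiment of "A but B" should follow the B clause. The lexicon and scorer below are illustrative assumptions, not the paper's model:

```python
def clause_after_but(tokens):
    """Return the clause following 'but', or the whole sentence if absent."""
    return tokens[tokens.index("but") + 1:] if "but" in tokens else tokens

def lexicon_score(tokens, lexicon):
    """Toy sentiment scorer: sum of per-word polarities."""
    return sum(lexicon.get(t, 0) for t in tokens)

def but_rule_score(tokens, lexicon):
    """Apply the A-but-B rule: score only the post-'but' clause."""
    return lexicon_score(clause_after_but(tokens), lexicon)
```

On "the plot was boring but the acting was great", a bag-of-words scorer sees the two clauses cancel out, while the but-rule correctly follows the positive B clause.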

Leveraging Native Language Speech for Accent Identification using Deep Siamese Networks

no code implementations • 25 Dec 2017 • Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy

The problem of automatic accent identification is important for several applications like speaker profiling and recognition as well as for improving speech recognition systems.

Speaker Profiling • Speech Recognition +1

Dual Language Models for Code Switched Speech Recognition

no code implementations • 3 Nov 2017 • Saurabh Garg, Tanmay Parekh, Preethi Jyothi

Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2
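The dual-LM idea sketched in this abstract can be illustrated with a toy bigram version: two monolingual models plus an explicit switch penalty at language boundaries (the corpora, add-alpha smoothing, and switch probability below are illustrative assumptions, not the authors' setup):

```python
import math
from collections import Counter

class BigramLM:
    """Add-alpha smoothed bigram language model over one language."""
    def __init__(self, sentences, alpha=1.0):
        self.uni, self.bi, self.alpha = Counter(), Counter(), alpha
        for s in sentences:
            toks = ["<s>"] + s
            self.uni.update(toks)
            self.bi.update(zip(toks, toks[1:]))
        self.vocab = len(self.uni)

    def logp(self, prev, word):
        num = self.bi[(prev, word)] + self.alpha
        den = self.uni[prev] + self.alpha * self.vocab
        return math.log(num / den)

def dual_lm_score(tagged, lms, log_p_switch=math.log(0.1)):
    """Score a code-switched sentence of (word, lang) pairs: within-language
    bigrams use that language's LM; a language switch pays a fixed penalty
    and restarts the incoming LM's context at <s>."""
    total, prev_word, prev_lang = 0.0, "<s>", tagged[0][1]
    for word, lang in tagged:
        if lang == prev_lang:
            total += lms[lang].logp(prev_word, word)
        else:
            total += log_p_switch + lms[lang].logp("<s>", word)
        prev_word, prev_lang = word, lang
    return total
```

Keeping the two monolingual models separate preserves each language's internal structure, while the switch penalty plays the role of the cross-language transition probability a single merged bilingual model would otherwise have to learn from scarce code-switched data.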

Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

no code implementations • 13 Dec 2016 • Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson

Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages.

Cross-Lingual ASR

Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR

no code implementations • WS 2016 • Wenda Chen, Mark Hasegawa-Johnson, Nancy Chen, Preethi Jyothi, Lav Varshney

We evaluate our techniques using mismatched transcriptions for Cantonese speech acquired from native English and Mandarin speakers.
