no code implementations • NAACL (sdp) 2021 • Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi
Large pretrained models have seen enormous success in extractive summarization tasks.
no code implementations • COLING 2022 • Rohit Kundu, Preethi Jyothi, Pushpak Bhattacharyya
We present a detailed pipeline to synthetically generate disfluent text and create evaluation datasets for four Indian languages: Bengali, Hindi, Malayalam, and Marathi.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • COLING 2022 • Barah Fazili, Preethi Jyothi
Multilingual pretrained models, while effective on monolingual data, need additional training to work well with code-switched text.
no code implementations • GWC 2018 • Hanumant Redkar, Rajita Shukla, Sandhya Singh, Jaya Saraswati, Laxmi Kashyap, Diptesh Kanojia, Preethi Jyothi, Malhar Kulkarni, Pushpak Bhattacharyya
This aid is based on modern pedagogical axioms and is aligned to the learning objectives of the syllabi of the school education in India.
no code implementations • GWC 2018 • Diptesh Kanojia, Preethi Jyothi, Pushpak Bhattacharyya
We also develop voices using the existing implementations of the aforementioned systems, and (2) We use these voices to generate sample audios for randomly chosen words; manually evaluate the audio generated, and produce audio for all WordNet words using the winner voice model.
no code implementations • 17 Oct 2024 • Abhishek Gupta, Amruta Parulekar, Sameep Chattopadhyay, Preethi Jyothi
Automatic speech recognition (ASR) for low-resource languages remains a challenge due to the scarcity of labeled training data.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
1 code implementation • 29 Aug 2024 • Ashish Mittal, Darshan Prabhu, Sunita Sarawagi, Preethi Jyothi
A challenge of our proposed coupling is handling the mismatch between the tokenizers of the LLM and ASR systems.
1 code implementation • 16 Jul 2024 • Sona Elza Simon, Soumen Kumar Mondal, Abhishek Singhania, Sayambhu Sen, Preethi Jyothi
Large language models (LLMs) encode vast amounts of world knowledge acquired via training on large web-scale datasets crawled from the internet.
1 code implementation • 15 Jul 2024 • Barah Fazili, Ashish Sunil Agrawal, Preethi Jyothi
Large language models (LLMs) are very proficient text generators.
no code implementations • 8 Jul 2024 • Krishnakant Bhatt, Karthika N J, Ganesh Ramakrishnan, Preethi Jyothi
Subword tokens in Indian languages inherently carry meaning, and isolating them can enhance NLP tasks, making sub-word segmentation a crucial process.
1 code implementation • 4 Jul 2024 • Darshan Prabhu, Yifan Peng, Preethi Jyothi, Shinji Watanabe
Convolutions have become essential in state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems due to their efficient modelling of local context.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 4 Jul 2024 • Darshan Prabhu, Abhishek Gupta, Omkar Nitsure, Preethi Jyothi, Sriram Ganapathy
Speech accents present a serious challenge to the performance of state-of-the-art end-to-end Automatic Speech Recognition (ASR) systems.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 16 Jun 2024 • Bhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya
Code-switching is a widely prevalent linguistic phenomenon in multilingual societies like India.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 18 May 2024 • Ayush Maheshwari, Atul Kumar Singh, Karthika NJ, Krishnakant Bhatt, Preethi Jyothi, Ganesh Ramakrishnan
Owing to the research gap in lexicon generation, especially with a limited focus on the domain-specific area, we propose a new model to generate dictionary words for 6 Indian languages in the multi-domain setting.
no code implementations • 12 Mar 2024 • Yash Sharma, Basil Abraham, Preethi Jyothi
An important and difficult task in code-switched speech recognition is to recognize the language, as many words in the two languages can sound similar, especially in some accents.
1 code implementation • 3 Feb 2024 • Ashish Sunil Agrawal, Barah Fazili, Preethi Jyothi
Popular benchmarks (e.g., XNLI) used to evaluate cross-lingual language understanding consist of parallel versions of English evaluation sets in multiple target languages created with the help of professional translators.
1 code implementation • 25 Oct 2023 • Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya
Towards the goal of multilingual disfluency correction, we present a high-quality human-annotated DC corpus covering four important Indo-European languages: English, Hindi, German and French.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
1 code implementation • 24 Oct 2023 • Darshan Prabhu, Preethi Jyothi, Sriram Ganapathy, Vinit Unni
In this work, we propose a novel accent adaptation approach for end-to-end ASR systems using cross-attention with a trainable set of codebooks.
1 code implementation • 10 Oct 2023 • Piyush Singh Pasi, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, Manoj Singh
The problem of audio-to-text alignment has seen a significant amount of research using complete supervision during training.
no code implementations • 11 Jul 2023 • Vinit S. Unni, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi
RNN-Transducers (RNN-Ts) have gained widespread acceptance as an end-to-end model for speech to text conversion because of their high accuracy and streaming capabilities.
1 code implementation • 10 Jun 2023 • Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya
Disfluencies commonly occur in conversational speech.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 26 May 2023 • Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya
Conversational speech often consists of deviations from the speech plan, producing disfluent utterances that affect downstream NLP tasks.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 2 Nov 2022 • Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe
In this work, we seek to build effective code-switched (CS) automatic speech recognition (ASR) systems under the zero-shot setting where no transcribed CS speech data is available for training.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 30 Oct 2022 • Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi, Ganesh Ramakrishnan
Training state-of-the-art ASR systems such as RNN-T often has a high associated financial and environmental cost.
no code implementations • 13 Oct 2022 • Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan
In this work we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries.
no code implementations • ACL 2022 • Soumya Chatterjee, Sunita Sarawagi, Preethi Jyothi
Online alignment in machine translation refers to the task of aligning a target word to a source word when the target sequence has only been partially decoded.
no code implementations • 31 Mar 2022 • Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan
We focus on the audio-visual video parsing (AVVP) problem that involves detecting audio and visual event labels with temporal boundaries.
no code implementations • 21 Feb 2022 • Vinit Unni, Shreya Khare, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi, Samarth Bharadwaj
RNN-Transducer (RNN-T) models have become synonymous with streaming end-to-end ASR systems.
no code implementations • 2 Feb 2022 • Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi
Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 10 Oct 2021 • Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi
To address this problem, we propose DITTO (Data-efficient and faIr Targeted subseT selectiOn) that uses Submodular Mutual Information (SMI) functions as acquisition functions to find the most informative set of utterances matching a target accent within a fixed budget.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • IJCAI 2021 • Arjit Jain, Pranay Reddy Samala, Preethi Jyothi, Deepak Mittal, Maneesh Singh
The original algorithm relies on computationally expensive data augmentation steps that involve perturbing the raw images and computing features for each perturbed image.
Image Augmentation Semi Supervised Learning for Image Captioning
no code implementations • EMNLP (MRL) 2021 • Archiki Prasad, Mohammad Ali Rehan, Shreya Pathak, Preethi Jyothi
In this work, we propose the use of bilingual intermediate pretraining as a reliable technique to derive large and consistent performance gains on three different NLP tasks using code-switched text.
1 code implementation • ACL 2021 • Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi
Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large volumes of real code-switched text.
1 code implementation • Findings (ACL) 2021 • Devaraja Adiga, Rishabh Kumar, Amrith Krishna, Preethi Jyothi, Ganesh Ramakrishnan, Pawan Goyal
In this work, we propose the first large scale study of automatic speech recognition (ASR) in Sanskrit, with an emphasis on the impact of unit selection in Sanskrit ASR.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
1 code implementation • 3 Apr 2021 • Jatin Lamba, abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan
In this paper, we present a novel approach to the audio-visual video parsing (AVVP) task that demarcates events from a video separately for audio and visual modalities.
Ranked #1 on Event Detection on Audio Set
1 code implementation • 1 Apr 2021 • Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham
For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages including two code-switched language pairs, Hindi-English and Bengali-English.
no code implementations • EACL 2021 • Nikhil Saini, Drumil Trivedi, Shreya Khare, Tejas Dhamecha, Preethi Jyothi, Samarth Bharadwaj, Pushpak Bhattacharyya
Spoken language is different from the written language in its style and structure.
no code implementations • 1 Apr 2021 • Vinod K Kurmi, Vipul Bajaj, Badri N Patro, K S Venkatesh, Vinay P Namboodiri, Preethi Jyothi
Towards this, we propose a method that demonstrates that we are able to generate naturalistic samples of video and audio data by the joint correlated generation of audio and video modalities.
1 code implementation • 9 Mar 2021 • Aman Jain, Mayank Kothyari, Vishwajeet Kumar, Preethi Jyothi, Ganesh Ramakrishnan, Soumen Chakrabarti
In response, we identify a key structural idiom in OKVQA, viz., S3 (select, substitute and search), and build a new data set and challenge around it.
1 code implementation • 9 Mar 2021 • Jayaprakash A, abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi
Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input.
Ranked #1 on Video Retrieval on Charades-STA
1 code implementation • 4 Mar 2021 • Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi
We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances.
1 code implementation • 11 Feb 2021 • Archiki Prasad, Preethi Jyothi, Rajbabu Velmurugan
A systematic comparison of these two approaches for end-to-end robust ASR has not been attempted before.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +5
1 code implementation • EACL 2021 • Ishan Tarunesh, Sushil Khyalia, Vishwajeet Kumar, Ganesh Ramakrishnan, Preethi Jyothi
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
no code implementations • 19 Oct 2020 • Anuj Diwan, Preethi Jyothi
This work presents a seemingly simple but effective technique to improve low-resource ASR systems for phonetic languages.
no code implementations • 12 Oct 2020 • Yash Sharma, Basil Abraham, Karan Taneja, Preethi Jyothi
Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • WS 2020 • Nikhil Saini, Jyotsana Khatri, Preethi Jyothi, Pushpak Bhattacharyya
We also make use of additional fluent text in the target language to help generate fluent translations.
no code implementations • ACL 2020 • Archiki Prasad, Preethi Jyothi
We use a state-of-the-art end-to-end ASR system, comprising convolutional and recurrent layers, that is trained on a large amount of US-accented English speech and evaluate the model on speech samples from seven different English accents.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
1 code implementation • 24 Jun 2020 • Kartik Khandelwal, Preethi Jyothi, Abhijeet Awasthi, Sunita Sarawagi
Accordingly, we propose a novel coupling of an open-source accent-tuned local model with the black-box service where the output from the service guides frame-level inference in the local model.
1 code implementation • 14 May 2020 • Vinit Unni, Nitish Joshi, Preethi Jyothi
We propose coupled training for encoder-decoder ASR models that acts on pairs of utterances corresponding to the same text spoken by speakers with different accents.
no code implementations • 25 Oct 2019 • Yash Shah, Ishan Tarunesh, Harsh Deshpande, Preethi Jyothi
Neural language models (LMs) have been shown to benefit significantly from enhancing word vectors with subword-level information, especially for morphologically rich languages.
no code implementations • 22 Jun 2019 • Brij Mohan Lal Srivastava, Basil Abraham, Sunayana Sitaram, Rupesh Mehta, Preethi Jyothi
While the lack of data adversely affects the performance of end-to-end models, we see promising improvements with MTL and balancing the corpus.
1 code implementation • ACL 2019 • Vishwajeet Kumar, Nitish Joshi, Arijit Mukherjee, Ganesh Ramakrishnan, Preethi Jyothi
For a new language, such training instances are hard to obtain, making the QG problem even more challenging.
no code implementations • EMNLP 2018 • Saurabh Garg, Tanmay Parekh, Preethi Jyothi
This work focuses on building language models (LMs) for code-switched text.
1 code implementation • EMNLP 2018 • Kalpesh Krishna, Preethi Jyothi, Mohit Iyyer
We analyze the performance of different sentiment classification models on syntactically complex inputs like A-but-B sentences.
1 code implementation • ICLR 2018 • Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, Sunita Sarawagi
We present CROSSGRAD, a method to use multi-domain training data to learn a classifier that generalizes to new domains.
Ranked #93 on Domain Generalization on PACS
no code implementations • ICLR 2018 • Prannay Khosla, Preethi Jyothi, Vinay P. Namboodiri, Mukundhan Srinivasan
In this paper, we propose the generation of accented speech using generative adversarial networks.
no code implementations • 25 Dec 2017 • Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy
The problem of automatic accent identification is important for several applications like speaker profiling and recognition as well as for improving speech recognition systems.
no code implementations • 3 Nov 2017 • Saurabh Garg, Tanmay Parekh, Preethi Jyothi
Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 13 Dec 2016 • Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson
Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages.
no code implementations • WS 2016 • Wenda Chen, Mark Hasegawa-Johnson, Nancy Chen, Preethi Jyothi, Lav Varshney
We evaluate our techniques using mismatched transcriptions for Cantonese speech acquired from native English and Mandarin speakers.