no code implementations • 1 Dec 2023 • Abhayjeet Singh, Charu Shah, Rajashri Varadaraj, Sonakshi Chauhan, Prasanta Kumar Ghosh
Transcripts for 23 hours are generated and validated, which can serve as a spontaneous speech ASR benchmark.
no code implementations • 13 Oct 2023 • Jesuraj Bandekar, Sathvik Udupa, Abhayjeet Singh, Anjali Jayakumar, Deekshitha G, Sandhya Badiger, Saurabh Kumar, Pooja VH, Prasanta Kumar Ghosh
With the advent of high-quality speech synthesis, there is a lot of interest in controlling various prosodic attributes of speech.
no code implementations • 16 Jul 2023 • Abhayjeet Singh, Arjun Singh Mehta, Ashish Khuraishi K S, Deekshitha G, Gauri Date, Jai Nanavati, Jesuraja Bandekar, Karnalius Basumatary, Karthika P, Sandhya Badiger, Sathvik Udupa, Saurabh Kumar, Savitha, Prasanta Kumar Ghosh, Prashanthi V, Priyanka Pai, Raoul Nanavati, Rohan Saxena, Sai Praneeth Reddy Mora, Srinivasa Raghavan
This is where a lot of adaptation and fine-tuning techniques can be applied to overcome the low-resource nature of the data by utilising well-resourced similar languages.
Automatic Speech Recognition (ASR) +2
no code implementations • 7 Apr 2023 • Shivani Yadav, Dipanjan Gope, Uma Maheswari K., Prasanta Kumar Ghosh
Dynamic programming with the prior information of the number of breath phases ($P$) and the breath phase duration ($d$) is used to find the boundaries.
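No code accompanies this entry, so the following is only a minimal sketch of how a duration-constrained dynamic program of this kind might look. The per-frame phase score, the segment cost, and the duration weight `lam` are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def dp_segment(scores, P, d, lam=1.0):
    """Split a 1-D frame score sequence into P contiguous segments.

    Each candidate segment pays a cost equal to its negative mean score
    plus lam * |length - d|, encoding the duration prior d.
    Returns the segment end boundaries (end-exclusive frame indices).
    """
    T = len(scores)
    prefix = np.concatenate([[0.0], np.cumsum(scores)])

    def seg_cost(i, j):  # cost of a segment covering frames i..j-1
        length = j - i
        return -(prefix[j] - prefix[i]) / length + lam * abs(length - d)

    INF = float("inf")
    cost = np.full((P + 1, T + 1), INF)
    back = np.zeros((P + 1, T + 1), dtype=int)
    cost[0, 0] = 0.0
    for p in range(1, P + 1):
        for j in range(p, T + 1):          # at least p frames for p segments
            for i in range(p - 1, j):      # last segment is frames i..j-1
                c = cost[p - 1, i] + seg_cost(i, j)
                if c < cost[p, j]:
                    cost[p, j] = c
                    back[p, j] = i
    # Backtrack the optimal boundaries from the final state.
    bounds, j = [], T
    for p in range(P, 0, -1):
        bounds.append(j)
        j = back[p, j]
    return bounds[::-1]

# Toy example: two clearly separated "phases" of 5 frames each.
print(dp_segment(np.array([1.0] * 5 + [0.0] * 5), P=2, d=5))  # [5, 10]
```

The O(P·T²) recurrence here is the textbook form; a real implementation on long breath recordings would likely restrict candidate segment lengths around d to keep the search cheap.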
1 code implementation • 11 Nov 2022 • Mohammad Shaique Solanki, Ashutosh M Bharadwaj, Jeevan K, Prasanta Kumar Ghosh
In this study, we explore the use of data-driven and knowledge-based features from vocal breath sounds as well as the classifier complexity for gender classification.
1 code implementation • 30 Oct 2022 • Sathvik Udupa, Prasanta Kumar Ghosh
Real-Time Magnetic resonance imaging (rtMRI) of the midsagittal plane of the mouth is of interest for speech production research.
1 code implementation • 30 Oct 2022 • Sathvik Udupa, Siddarth C, Prasanta Kumar Ghosh
In this work, we investigate the effectiveness of pretrained Self-Supervised Learning (SSL) features for learning the mapping for acoustic to articulatory inversion (AAI).
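The entry above describes mapping pretrained SSL features to articulatory trajectories; a minimal, self-contained sketch of that pipeline is below. The feature matrix `X` and articulator targets `Y` are synthetic stand-ins (in the paper the features come from a pretrained SSL model and the targets from articulography), and the linear least-squares map is the simplest possible inversion model, not the authors' architecture.

```python
import numpy as np

# Hypothetical setup: X holds frame-level SSL features (T x F), Y the
# corresponding articulator trajectories (T x A). Both are synthetic here.
rng = np.random.default_rng(0)
T, F, A = 200, 16, 6
W_true = rng.normal(size=(F, A))
X = rng.normal(size=(T, F))
Y = X @ W_true + 0.01 * rng.normal(size=(T, A))

# Simplest inversion model: a linear map fit by least squares.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ W

# AAI is typically scored with the per-articulator correlation coefficient (CC).
cc = [np.corrcoef(Y[:, a], Y_hat[:, a])[0, 1] for a in range(A)]
print(f"mean CC: {np.mean(cc):.3f}")
```

Swapping the synthetic `X` for features extracted from a real pretrained SSL checkpoint, and the linear map for a sequence model, recovers the shape of the actual experiment.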
no code implementations • 9 Mar 2022 • Anwesha Roy, Varun Belagali, Prasanta Kumar Ghosh
We also propose two new evaluation metrics for ATB segmentation separately in contour1 and contour2 to explicitly capture two types of errors in these contours.
no code implementations • 8 Dec 2021 • Abhayjeet Singh, Achuth Rao MV, Rakesh Vaideeswaran, Chiranjeevi Yarra, Prasanta Kumar Ghosh
We observe that the sentence and speaker difficulty ratings and the WERs increase from easy to hard categories of sentences.
Automatic Speech Recognition (ASR) +3
1 code implementation • 1 Jun 2021 • Srikanth Raj Chetupalli, Prashant Krishnan, Neeraj Sharma, Ananya Muguli, Rohit Kumar, Viral Nanda, Lancelot Mark Pinto, Prasanta Kumar Ghosh, Sriram Ganapathy
The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of the COVID-19 pandemic.
1 code implementation • 11 Apr 2021 • Sathvik Udupa, Anwesha Roy, Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
Additionally, on the AAI task, we obtain 1.5%, 3% and 3.1% relative gain in CC on the same setups compared to the state-of-the-art baseline.
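For reference, "relative gain in CC" is the improvement over the baseline's correlation coefficient expressed as a percentage of the baseline. The helper and the numbers below are illustrative, not the paper's actual scores.

```python
def relative_gain(cc_new, cc_base):
    """Relative gain of cc_new over cc_base, as a percentage."""
    return 100.0 * (cc_new - cc_base) / cc_base

# Illustrative values only: a baseline CC of 0.835 improved to 0.860
# corresponds to roughly a 3% relative gain.
print(round(relative_gain(0.86, 0.835), 1))
```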
1 code implementation • 1 Apr 2021 • Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham
For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages including two code-switched language pairs, Hindi-English and Bengali-English.
no code implementations • 16 Mar 2021 • Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning.
no code implementations • 4 Jun 2020 • Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
As the range of articulatory motions is correlated with speaking rate, we also analyze the amplitude of the transformed articulatory movements at different rates relative to their original counterparts, to examine how well the proposed AstNet predicts the extent of articulatory movements in N2F and N2S.
no code implementations • 31 Oct 2019 • Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
While an attention network is used for estimating articulatory movement in the case of R2, a BLSTM network is used for R1 and R3.