no code implementations • 1 Dec 2023 • Abhayjeet Singh, Charu Shah, Rajashri Varadaraj, Sonakshi Chauhan, Prasanta Kumar Ghosh
Transcripts for 23 hours are generated and validated, which can serve as a spontaneous speech ASR benchmark.
no code implementations • 13 Oct 2023 • Jesuraj Bandekar, Sathvik Udupa, Abhayjeet Singh, Anjali Jayakumar, Deekshitha G, Sandhya Badiger, Saurabh Kumar, Pooja VH, Prasanta Kumar Ghosh
With the advent of high-quality speech synthesis, there is a lot of interest in controlling various prosodic attributes of speech.
no code implementations • 16 Jul 2023 • Abhayjeet Singh, Arjun Singh Mehta, Ashish Khuraishi K S, Deekshitha G, Gauri Date, Jai Nanavati, Jesuraja Bandekar, Karnalius Basumatary, Karthika P, Sandhya Badiger, Sathvik Udupa, Saurabh Kumar, Savitha, Prasanta Kumar Ghosh, Prashanthi V, Priyanka Pai, Raoul Nanavati, Rohan Saxena, Sai Praneeth Reddy Mora, Srinivasa Raghavan
This is where a lot of adaptation and fine-tuning techniques can be applied to overcome the low-resource nature of the data by utilising well-resourced similar languages.
Automatic Speech Recognition (ASR) +2
no code implementations • 7 Apr 2023 • Shivani Yadav, Dipanjan Gope, Uma Maheswari K., Prasanta Kumar Ghosh
Dynamic programming with the prior information of the number of breath phases ($P$) and the breath phase duration ($d$) is used to find the boundaries.
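No code accompanies this entry, so the following is only a minimal sketch of how a duration-constrained dynamic program of this kind might look. The per-frame phase score, the segment cost, and the duration weight `lam` are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def dp_segment(scores, P, d, lam=1.0):
    """Split a 1-D frame score sequence into P contiguous segments.

    Each candidate segment pays a cost equal to its negative mean score
    plus lam * |length - d|, encoding the duration prior d.
    Returns the segment end boundaries (end-exclusive frame indices).
    """
    T = len(scores)
    prefix = np.concatenate([[0.0], np.cumsum(scores)])

    def seg_cost(i, j):  # cost of a segment covering frames i..j-1
        length = j - i
        return -(prefix[j] - prefix[i]) / length + lam * abs(length - d)

    INF = float("inf")
    cost = np.full((P + 1, T + 1), INF)
    back = np.zeros((P + 1, T + 1), dtype=int)
    cost[0, 0] = 0.0
    for p in range(1, P + 1):
        for j in range(p, T + 1):          # at least p frames for p segments
            for i in range(p - 1, j):      # last segment is frames i..j-1
                c = cost[p - 1, i] + seg_cost(i, j)
                if c < cost[p, j]:
                    cost[p, j] = c
                    back[p, j] = i
    # Backtrack the optimal boundaries from the final state.
    bounds, j = [], T
    for p in range(P, 0, -1):
        bounds.append(j)
        j = back[p, j]
    return bounds[::-1]

# Toy example: two clearly separated "phases" of 5 frames each.
print(dp_segment(np.array([1.0] * 5 + [0.0] * 5), P=2, d=5))  # [5, 10]
```

The O(P·T²) recurrence here is the textbook form; a real implementation on long breath recordings would likely restrict candidate segment lengths around d to keep the search cheap.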
1 code implementation • 11 Nov 2022 • Mohammad Shaique Solanki, Ashutosh M Bharadwaj, Jeevan K, Prasanta Kumar Ghosh
In this study, we explore the use of data-driven and knowledge-based features from vocal breath sounds as well as the classifier complexity for gender classification.
1 code implementation • 30 Oct 2022 • Sathvik Udupa, Prasanta Kumar Ghosh
Real-Time Magnetic resonance imaging (rtMRI) of the midsagittal plane of the mouth is of interest for speech production research.
1 code implementation • 30 Oct 2022 • Sathvik Udupa, Siddarth C, Prasanta Kumar Ghosh
In this work, we investigate the effectiveness of pretrained Self-Supervised Learning (SSL) features for learning the mapping for acoustic to articulatory inversion (AAI).
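The entry above describes mapping pretrained SSL features to articulatory trajectories; a minimal, self-contained sketch of that pipeline is below. The feature matrix `X` and articulator targets `Y` are synthetic stand-ins (in the paper the features come from a pretrained SSL model and the targets from articulography), and the linear least-squares map is the simplest possible inversion model, not the authors' architecture.

```python
import numpy as np

# Hypothetical setup: X holds frame-level SSL features (T x F), Y the
# corresponding articulator trajectories (T x A). Both are synthetic here.
rng = np.random.default_rng(0)
T, F, A = 200, 16, 6
W_true = rng.normal(size=(F, A))
X = rng.normal(size=(T, F))
Y = X @ W_true + 0.01 * rng.normal(size=(T, A))

# Simplest inversion model: a linear map fit by least squares.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_hat = X @ W

# AAI is typically scored with the per-articulator correlation coefficient (CC).
cc = [np.corrcoef(Y[:, a], Y_hat[:, a])[0, 1] for a in range(A)]
print(f"mean CC: {np.mean(cc):.3f}")
```

Swapping the synthetic `X` for features extracted from a real pretrained SSL checkpoint, and the linear map for a sequence model, recovers the shape of the actual experiment.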
no code implementations • 9 Mar 2022 • Anwesha Roy, Varun Belagali, Prasanta Kumar Ghosh
We also propose two new evaluation metrics for ATB segmentation separately in contour1 and contour2 to explicitly capture two types of errors in these contours.
no code implementations • 8 Dec 2021 • Abhayjeet Singh, Achuth Rao MV, Rakesh Vaideeswaran, Chiranjeevi Yarra, Prasanta Kumar Ghosh
We observe that the sentence and speaker difficulty ratings and the WERs increase from easy to hard categories of sentences.
Automatic Speech Recognition (ASR) +3
1 code implementation • 1 Jun 2021 • Srikanth Raj Chetupalli, Prashant Krishnan, Neeraj Sharma, Ananya Muguli, Rohit Kumar, Viral Nanda, Lancelot Mark Pinto, Prasanta Kumar Ghosh, Sriram Ganapathy
The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of the COVID-19 pandemic.
1 code implementation • 11 Apr 2021 • Sathvik Udupa, Anwesha Roy, Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
Additionally, on the AAI task, we obtain 1.5%, 3% and 3.1% relative gain in CC on the same setups compared to the state-of-the-art baseline.
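For reference, "relative gain in CC" is the improvement over the baseline's correlation coefficient expressed as a percentage of the baseline. The helper and the numbers below are illustrative, not the paper's actual scores.

```python
def relative_gain(cc_new, cc_base):
    """Relative gain of cc_new over cc_base, as a percentage."""
    return 100.0 * (cc_new - cc_base) / cc_base

# Illustrative values only: a baseline CC of 0.835 improved to 0.860
# corresponds to roughly a 3% relative gain.
print(round(relative_gain(0.86, 0.835), 1))
```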
1 code implementation • 1 Apr 2021 • Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham
For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages including two code-switched language pairs, Hindi-English and Bengali-English.
no code implementations • 16 Mar 2021 • Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning.
no code implementations • 4 Jun 2020 • Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
As the range of articulatory motions is correlated with speaking rate, we also analyze the amplitude of the transformed articulatory movements at different rates relative to their original counterparts, to examine how well the proposed AstNet predicts the extent of articulatory movements in N2F and N2S.
no code implementations • 31 Oct 2019 • Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh
While an attention network is used for estimating articulatory movement in the case of R2, a BLSTM network is used for R1 and R3.