no code implementations • 26 Aug 2022 • Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi
In this paper, we propose a model to perform style transfer of speech to singing voice.
1 code implementation • 27 Jun 2022 • Debottam Dutta, Debarpan Bhattacharya, Sriram Ganapathy, Amir H. Poorjam, Deepak Mittal, Maneesh Singh
In this paper, we describe an approach for representation learning of audio signals for the task of COVID-19 detection.
no code implementations • 24 Jun 2022 • Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan
The COVID-19 outbreak resulted in multiple waves of infections that have been associated with different SARS-CoV-2 variants.
no code implementations • 11 Jun 2022 • Deepak Mittal, Amir H. Poorjam, Debottam Dutta, Debarpan Bhattacharya, Zemin Yu, Sriram Ganapathy, Maneesh Singh
This report describes the system used for detecting COVID-19 positives using three different acoustic modalities, namely speech, breathing, and cough in the second DiCOVA challenge.
1 code implementation • 9 Jun 2022 • Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, Murali Alagesan
The COVID-19 pandemic has accelerated research on the design of alternative, quick, and effective COVID-19 diagnosis approaches.
no code implementations • 4 Oct 2021 • Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Debarpan Bhattacharya, Debottam Dutta, Pravin Mote, Sriram Ganapathy
This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough and speech signals.
1 code implementation • 14 Sep 2021 • Prachi Singh, Sriram Ganapathy
In this paper, we propose an approach that jointly learns the speaker embeddings and the similarity metric using principles of self-supervised learning.
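A minimal sketch of the general idea, not the authors' system: a toy embedding network and a learnable bilinear similarity trained jointly from pseudo pair labels, which in a self-supervised setup would come from an initial clustering pass. The dimensions, architecture, and binary cross-entropy loss are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EmbeddingAndScorer(nn.Module):
    """Toy model: an embedding extractor and a learnable bilinear similarity."""
    def __init__(self, feat_dim=40, emb_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim)
        )
        self.W = nn.Parameter(torch.eye(emb_dim))  # learnable similarity metric

    def forward(self, x1, x2):
        e1, e2 = self.encoder(x1), self.encoder(x2)
        return torch.sum(e1 @ self.W * e2, dim=-1)  # one similarity score per pair

model = EmbeddingAndScorer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Pseudo-labels (same/different speaker) would come from an initial clustering pass.
x1, x2 = torch.randn(32, 40), torch.randn(32, 40)
pseudo_same = torch.randint(0, 2, (32,)).float()

score = model(x1, x2)
loss = nn.functional.binary_cross_entropy_with_logits(score, pseudo_same)
loss.backward(); opt.step()   # embeddings and metric are updated together
```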
no code implementations • 12 Aug 2021 • Anurenjan Purushothaman, Anirudh Sreeram, Rohit Kumar, Sriram Ganapathy
The dereverberated envelopes are used for feature extraction in speech recognition.
1 code implementation • 9 Aug 2021 • Rohit Kumar, Anurenjan Purushothaman, Anirudh Sreeram, Sriram Ganapathy
In this paper, we develop a feature enhancement approach using a neural model operating on sub-band temporal envelopes.
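For illustration only, a sketch of extracting sub-band temporal envelopes with a band-pass filter bank and Hilbert envelopes; the paper's envelope estimation and the neural enhancement model that operates on these envelopes are not reproduced here, and the band edges are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def subband_envelopes(x, fs, bands=((100, 400), (400, 1000), (1000, 4000))):
    """Split a waveform into sub-bands and return their Hilbert envelopes."""
    envs = []
    for lo, hi in bands:
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        sub = filtfilt(b, a, x)              # band-limited signal
        envs.append(np.abs(hilbert(sub)))    # temporal envelope of the band
    return np.stack(envs)                    # (num_bands, num_samples)

fs = 16000
x = np.random.randn(fs)                      # 1 s of dummy audio
E = subband_envelopes(x, fs)                 # envelopes that an enhancement model would refine
```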
no code implementations • 30 Jul 2021 • Debottam Dutta, Purvi Agrawal, Sriram Ganapathy
The relevance weighted representations are fed to a neural classifier and the whole system is trained jointly for the audio classification objective.
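A hedged sketch of the joint-training idea: a small relevance sub-network produces soft weights over sub-band representations, the weighted representation feeds a classifier, and a single classification loss updates both parts. The sizes and architecture are assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class RelevanceWeighting(nn.Module):
    """Toy relevance-weighting front end followed by a classifier, trained jointly."""
    def __init__(self, num_bands=64, num_frames=100, num_classes=10):
        super().__init__()
        self.relevance = nn.Sequential(nn.Linear(num_frames, 32), nn.ReLU(),
                                       nn.Linear(32, 1))          # one weight per band
        self.classifier = nn.Sequential(nn.Flatten(),
                                        nn.Linear(num_bands * num_frames, num_classes))

    def forward(self, spec):                     # spec: (batch, bands, frames)
        w = torch.sigmoid(self.relevance(spec))  # (batch, bands, 1) soft relevance weights
        return self.classifier(spec * w)         # weighted representation -> classifier

model = RelevanceWeighting()
logits = model(torch.randn(8, 64, 100))
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (8,)))
loss.backward()                                  # gradients flow to both sub-networks
```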
no code implementations • 24 Jun 2021 • R G Prithvi Raj, Rohit Kumar, M K Jayesh, Anurenjan Purushothaman, Sriram Ganapathy, M A Basha Shaik
This paper presents the details of the SRIB-LEAP submission to the ConferencingSpeech challenge 2021.
no code implementations • 21 Jun 2021 • Neeraj Kumar Sharma, Ananya Muguli, Prashant Krishnan, Rohit Kumar, Srikanth Raj Chetupalli, Sriram Ganapathy
As part of the challenge, datasets with breathing, cough, and speech sound samples from COVID-19 and non-COVID-19 individuals were released to the participants.
1 code implementation • 1 Jun 2021 • Srikanth Raj Chetupalli, Prashant Krishnan, Neeraj Sharma, Ananya Muguli, Rohit Kumar, Viral Nanda, Lancelot Mark Pinto, Prasanta Kumar Ghosh, Sriram Ganapathy
The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of the COVID-19 pandemic.
no code implementations • 18 May 2021 • Jaswanth Reddy Katthi, Sriram Ganapathy
A deep model is proposed for intra-subject audio-EEG analysis based on directly optimizing the correlation loss.
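A minimal sketch of training two projection networks by directly optimizing a correlation objective, assuming one-dimensional projections and a simple Pearson-correlation loss; the actual architecture and loss details of the paper are not reproduced.

```python
import torch
import torch.nn as nn

def neg_correlation(a, b, eps=1e-8):
    """Negative Pearson correlation between two 1-D projections (to be minimized)."""
    a = a - a.mean()
    b = b - b.mean()
    return -(a * b).sum() / (a.norm() * b.norm() + eps)

audio_net = nn.Sequential(nn.Linear(64, 32), nn.Tanh(), nn.Linear(32, 1))
eeg_net = nn.Sequential(nn.Linear(128, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(audio_net.parameters()) + list(eeg_net.parameters()), lr=1e-3)

audio_feats, eeg_feats = torch.randn(200, 64), torch.randn(200, 128)  # time-aligned features
loss = neg_correlation(audio_net(audio_feats).squeeze(-1),
                       eeg_net(eeg_feats).squeeze(-1))
loss.backward(); opt.step()      # both networks are updated to increase the correlation
```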
1 code implementation • 19 Apr 2021 • Prachi Singh, Sriram Ganapathy
In this paper, we propose a representation learning and clustering algorithm that can be iteratively performed for improved speaker diarization.
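A schematic sketch of alternating clustering and representation refinement; the refinement step below is only a stand-in placeholder, and the clustering choice (agglomerative with a fixed number of clusters) is an assumption rather than the paper's algorithm.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def refine_embeddings(embeddings, labels):
    """Placeholder for a representation-learning step driven by current cluster labels.

    In practice this would train a neural model on the pseudo-labels and
    re-extract embeddings; here we only center within clusters as a stand-in.
    """
    out = embeddings.copy()
    for c in np.unique(labels):
        idx = labels == c
        out[idx] -= out[idx].mean(axis=0, keepdims=True)
    return out

X = np.random.randn(50, 128)              # segment-level speaker embeddings
for _ in range(3):                        # alternate clustering and representation learning
    labels = AgglomerativeClustering(n_clusters=4).fit_predict(X)
    X = refine_embeddings(X, labels)
```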
no code implementations • 6 Apr 2021 • Prachi Singh, Rajat Varma, Venkat Krishnamohan, Srikanth Raj Chetupalli, Sriram Ganapathy
This paper describes the challenge submission, the post-evaluation analysis and improvements observed on the DIHARD-III dataset.
no code implementations • 5 Apr 2021 • Srikanth Raj Chetupalli, Sriram Ganapathy
The proposed model is a combination of a speaker diarization system and a hybrid automatic speech recognition (ASR) system.
no code implementations • 16 Mar 2021 • Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda
The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning.
no code implementations • 11 Mar 2021 • Jaswanth Reddy Katthi, Sriram Ganapathy
The experiments are performed on EEG data collected from subjects listening to natural speech and music.
1 code implementation • 17 Feb 2021 • Sakya Basak, Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi
This approach, called voice to singing (V2S), performs the voice style conversion by modulating the F0 contour of the natural speech with that of a singing voice.
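To illustrate the underlying signal-processing idea (not the authors' V2S model), the sketch below imposes the F0 contour of a singing clip onto a speech utterance using the WORLD vocoder via the pyworld package; the file names are placeholders, mono audio is assumed, and the naive length-based resampling of the contour is an assumption.

```python
import numpy as np
import pyworld as pw
import soundfile as sf

speech, fs = sf.read("speech.wav")        # placeholder paths; mono audio assumed
singing, _ = sf.read("singing.wav")
speech, singing = speech.astype(np.float64), singing.astype(np.float64)

# WORLD analysis of the speech signal
f0_sp, t_sp = pw.harvest(speech, fs)
sp = pw.cheaptrick(speech, f0_sp, t_sp, fs)
ap = pw.d4c(speech, f0_sp, t_sp, fs)

# F0 contour of the singing voice, resampled to the speech frame count
f0_sing, t_sing = pw.harvest(singing, fs)
f0_new = np.interp(np.linspace(0, 1, len(f0_sp)),
                   np.linspace(0, 1, len(f0_sing)), f0_sing)

# Re-synthesize the speech with the singing F0 contour imposed on it
converted = pw.synthesize(f0_new, sp, ap, fs)
sf.write("speech_as_singing.wav", converted, fs)
```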
2 code implementations • 2 Dec 2020 • Neville Ryant, Prachi Singh, Venkat Krishnamohan, Rajat Varma, Kenneth Church, Christopher Cieri, Jun Du, Sriram Ganapathy, Mark Liberman
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain.
1 code implementation • 11 Aug 2020 • Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy
Recently, we proposed a neural network approach for backend modeling in speaker verification, called the neural PLDA (NPLDA), in which the likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function and the learnable parameters of the score function are optimized using a verification cost.
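A hedged sketch of the scoring idea: a quadratic pairwise score with learnable parameters, optimized with a binary verification loss. The exact NPLDA parameterization and the smoothed detection-cost objective described in the paper are not reproduced; the dimensions and the BCE loss here are assumptions.

```python
import torch
import torch.nn as nn

class PairwiseScorer(nn.Module):
    """Quadratic pairwise score in the spirit of a PLDA log-likelihood ratio,
    with all parameters treated as learnable."""
    def __init__(self, dim=128):
        super().__init__()
        self.P = nn.Parameter(torch.eye(dim))         # cross term
        self.Q = nn.Parameter(torch.zeros(dim, dim))  # within-trial terms
        self.c = nn.Parameter(torch.zeros(1))

    def forward(self, xe, xt):                        # enrollment / test embeddings
        cross = torch.sum(xe @ self.P * xt, dim=-1)
        quad = torch.sum(xe @ self.Q * xe, dim=-1) + torch.sum(xt @ self.Q * xt, dim=-1)
        return cross + quad + self.c

scorer = PairwiseScorer()
xe, xt = torch.randn(16, 128), torch.randn(16, 128)
target = torch.randint(0, 2, (16,)).float()   # 1 = same-speaker trial
# A simple verification cost stands in for the paper's soft detection-cost objective.
loss = nn.functional.binary_cross_entropy_with_logits(scorer(xe, xt), target)
loss.backward()
```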
1 code implementation • 10 Aug 2020 • Prachi Singh, Sriram Ganapathy
In this paper, we propose a novel algorithm for hierarchical clustering which combines the speaker clustering along with a representation learning framework.
no code implementations • 7 Aug 2020 • Anurenjan Purushothaman, Anirudh Sreeram, Rohit Kumar, Sriram Ganapathy
Automatic speech recognition in reverberant conditions is a challenging task as the long-term envelopes of the reverberant speech are temporally smeared.
1 code implementation • 12 Jul 2020 • Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy, Ragesh Rajan M, Prashant Krishnan
Metadata for speaker profiling applications, such as linguistic information, regional information, and physical characteristics of a speaker, is also collected.
no code implementations • 2 Apr 2020 • Bharat Padi, Anand Mohan, Sriram Ganapathy
In particular, a new model is proposed for incorporating relevance in language recognition, where portions of the speech data are weighted more heavily based on their relevance to the language recognition task.
1 code implementation • 10 Feb 2020 • Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy
The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity function and the learnable parameters of the score function are optimized using a verification cost.
no code implementations • 7 Feb 2020 • Shreyas Ramoji, Prashant Krishnan, Bhargavram Mysore, Prachi Singh, Sriram Ganapathy
In this paper, we provide a detailed account of the LEAP SRE system submitted to the CTS challenge focusing on the novel components in the back-end system modeling.
1 code implementation • 20 Jan 2020 • Shreyas Ramoji, Prashant Krishnan V, Prachi Singh, Sriram Ganapathy
The pre-processing steps of linear discriminant analysis (LDA), unit length normalization and within class covariance normalization are all modeled as layers of a neural model and the speaker verification cost functions can be back-propagated through these layers during training.
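A minimal sketch of expressing that pre-processing chain as differentiable layers: an affine projection (which could be initialized from a closed-form LDA estimate), unit-length normalization, and a second affine transform (which could be initialized from a WCCN estimate). The dimensions are assumptions, and the initialization from closed-form estimates is not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BackendAsLayers(nn.Module):
    """Pre-processing chain expressed as differentiable layers."""
    def __init__(self, in_dim=512, lda_dim=170):
        super().__init__()
        self.lda = nn.Linear(in_dim, lda_dim, bias=True)    # LDA-like projection
        self.wccn = nn.Linear(lda_dim, lda_dim, bias=False) # WCCN-like transform

    def forward(self, x):
        x = self.lda(x)
        x = F.normalize(x, p=2, dim=-1)   # unit length normalization
        return self.wccn(x)

backend = BackendAsLayers()
emb = torch.randn(4, 512)
out = backend(emb)
out.sum().backward()   # a verification cost can be back-propagated through all steps
```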
1 code implementation • 29 Nov 2019 • Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Parthasaarathy Sudarsanam, Sriram Ganapathy, Yuki Mitsufuji
Despite recent advances in voice separation methods, many challenges remain in realistic scenarios such as noisy recordings and the limits of available data.
no code implementations • 28 Nov 2019 • Rohit Kumar, Anirudh Sreeram, Anurenjan Purushothaman, Sriram Ganapathy
These models (the teacher models) are trained using a paired corpus of clean and noisy recordings.
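A hedged sketch of one common teacher-student setup consistent with that description: a teacher network sees the clean half of a paired corpus and provides soft targets for a student network that sees the noisy half. The architectures, the KL-to-soft-posteriors target, and the output size are assumptions, not the paper's specifics.

```python
import torch
import torch.nn as nn

feat_dim, num_targets = 40, 500   # sizes are illustrative assumptions

teacher = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, num_targets))
student = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, num_targets))

# Paired corpus: the same utterance observed in clean and in noisy conditions.
clean_feats, noisy_feats = torch.randn(32, feat_dim), torch.randn(32, feat_dim)

with torch.no_grad():
    soft_targets = torch.softmax(teacher(clean_feats), dim=-1)   # teacher sees clean data

student_logp = torch.log_softmax(student(noisy_feats), dim=-1)   # student sees noisy data
loss = nn.functional.kl_div(student_logp, soft_targets, reduction="batchmean")
loss.backward()
```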
no code implementations • 13 Nov 2019 • Anurenjan Purushothaman, Anirudh Sreeram, Sriram Ganapathy
The MAR features are fed to a convolutional neural network (CNN) architecture which performs the joint acoustic modeling on the three dimensions.
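For illustration, a sketch of feeding a three-axis feature tensor to a CNN, with one axis mapped to the input channels; the axis names, sizes, and network depth are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Dummy MAR-style features with three axes besides the batch; the exact axes
# (here labelled rate, time, frequency) and sizes are illustrative assumptions.
feats = torch.randn(8, 3, 100, 40)                 # (batch, rate, time, freq)

cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),    # rate axis enters as input channels
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 3000),                           # acoustic-model targets (size assumed)
)
logits = cnn(feats)
```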
1 code implementation • 18 Jun 2019 • Neville Ryant, Kenneth Church, Christopher Cieri, Alejandrina Cristia, Jun Du, Sriram Ganapathy, Mark Liberman
This paper introduces DIHARD II, the second in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variation in recording equipment, noise conditions, and conversational domain.
no code implementations • 25 Dec 2017 • Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy
The problem of automatic accent identification is important for several applications like speaker profiling and recognition as well as for improving speech recognition systems.
no code implementations • 5 May 2016 • Seyed Omid Sadjadi, Jason Pelecanos, Sriram Ganapathy
We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech.
no code implementations • 23 Feb 2016 • Seyed Omid Sadjadi, Sriram Ganapathy, Jason W. Pelecanos
In this paper, we describe the recent advancements made in the IBM i-vector speaker recognition system for conversational speech.