no code implementations • 4 Mar 2024 • Tahir Javed, Janki Atul Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K M, C Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M Khapra
We present INDICVOICES, a dataset of natural and spontaneous speech containing a total of 7348 hours of read (9%), extempore (74%) and conversational (17%) audio from 16237 speakers covering 145 Indian districts and 22 languages.
1 code implementation • 24 May 2023 • Kaushal Santosh Bhogale, Sai Sundaresan, Abhigyan Raman, Tahir Javed, Mitesh M. Khapra, Pratyush Kumar
In this paper, we focus on Indian languages, and make the case that diverse benchmarks are required to evaluate and improve ASR systems for Indian languages.
no code implementations • 26 Aug 2022 • Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra
Significantly, we show that adding Shrutilipi to the training set of Wav2Vec models leads to an average decrease in WER of 5. 8\% for 7 languages on the IndicSUPERB benchmark.
Optical Character Recognition (OCR) Self-Supervised Learning +3
1 code implementation • 24 Aug 2022 • Tahir Javed, Kaushal Santosh Bhogale, Abhigyan Raman, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra
We hope IndicSUPERB contributes to the progress of developing speech language understanding models for Indian languages.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +6
no code implementations • 6 Nov 2021 • Tahir Javed, Sumanth Doddapaneni, Abhigyan Raman, Kaushal Santosh Bhogale, Gowtham Ramesh, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra
Second, using this raw speech data we pretrain several variants of wav2vec style models for 40 Indian languages.