no code implementations • 9 Nov 2022 • Rakesh Vaideeswaran, Feng Gao, Abhinav Mathur, Govind Thattai
Our method generates human-readable textual explanations while maintaining SOTA VQA accuracy on the GQA-REX (77.49%) and VQA-E (71.48%) datasets.
no code implementations • 8 Dec 2021 • Abhayjeet Singh, Achuth Rao MV, Rakesh Vaideeswaran, Chiranjeevi Yarra, Prasanta Kumar Ghosh
We observe that the sentence and speaker difficulty ratings and the WERs increase from the easy to the hard categories of sentences.
Automatic Speech Recognition (ASR)
1 code implementation • 1 Apr 2021 • Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham
For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages including two code-switched language pairs, Hindi-English and Bengali-English.