no code implementations • 18 Nov 2023 • Gabriel Cosache, Francisco Salgado, Cosmin Rotariu, George Sterpu, Rishabh Jain, Peter Corcoran
An overview is given of the DAVID Smart-Toy platform, one of the first Edge AI platform designs to incorporate advanced low-power data processing by neural inference models co-located with the relevant image or audio sensors.
no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran
Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.
1 code implementation • 7 Nov 2023 • Andrei Barcovschi, Rishabh Jain, Peter Corcoran
We demonstrate that finetuning Conformer-transducer models on child speech yields significant improvements in ASR performance on child speech, compared to the non-finetuned models.
Automatic Speech Recognition (ASR)
1 code implementation • 7 Nov 2023 • Rishabh Jain, Peter Corcoran
The approach involved finetuning a multi-speaker TTS model to work with child speech.
Automatic Speech Recognition (ASR)
1 code implementation • 19 Sep 2023 • Surbhi Madan, Rishabh Jain, Gulshan Sharma, Ramanathan Subramanian, Abhinav Dhall
Bodily behavioral language is an important social cue, and its automated analysis helps in enhancing the understanding of artificial intelligence systems.
no code implementations • 7 Sep 2023 • Rishabh Jain
Interpretability and explainability of neural networks are continuously increasing in importance, especially within safety-critical domains and to provide the social right to explanation.
1 code implementation • 24 Jul 2023 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Peter Corcoran, Horia Cucu
We demonstrate that finetuning Whisper on child speech yields significant improvements in ASR performance on child speech, compared to non-finetuned Whisper models.
Automatic Speech Recognition (ASR)
no code implementations • 18 Jul 2023 • Rishabh Jain, Petar Veličković, Pietro Liò
Graph Neural Networks (GNNs) have shown considerable success in neural algorithmic reasoning.
Ranked #20 on Graph Regression on Peptides-struct
no code implementations • 10 Jul 2023 • Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy
Segmentation of objects in a video is challenging due to nuances such as motion blur, parallax, occlusions, and changes in illumination.
no code implementations • CVPR 2023 • Mausoom Sarkar, Nikitha SR, Mayur Hemani, Rishabh Jain, Balaji Krishnamurthy
Face parsing is defined as the per-pixel labeling of images containing human faces.
no code implementations • ICCV 2023 • Rishabh Jain, Mayur Hemani, Duygu Ceylan, Krishna Kumar Singh, Jingwan Lu, Mausoom Sarkar, Balaji Krishnamurthy
Numerous pose-guided human editing methods have been explored by the vision community due to their extensive practical applications.
no code implementations • CVPR 2023 • Rishabh Jain, Krishna Kumar Singh, Mayur Hemani, Jingwan Lu, Mausoom Sarkar, Duygu Ceylan, Balaji Krishnamurthy
The task of human reposing involves generating a realistic image of a person standing in an arbitrary conceivable pose.
1 code implementation • 4 Nov 2022 • Rishabh Jain, Gabriele Ciravegna, Pietro Barbiero, Francesco Giannini, Davide Buffelli, Pietro Liò
Recently, Logic Explained Networks (LENs) have been proposed as explainable-by-design neural models providing logic explanations for their predictions.
no code implementations • 30 Aug 2022 • Aakash Sharma, Vivek M. Bhasi, Sonali Singh, Rishabh Jain, Jashwant Raj Gunasekaran, Subrata Mitra, Mahmut Taylan Kandemir, George Kesidis, Chita R. Das
We aim to resolve this problem by introducing a comprehensive distributed deep learning (DDL) profiler, which can determine the various execution "stalls" that DDL suffers from while running on a public cloud.
no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Using just 10 hours of child speech data for finetuning, our models outperformed wav2vec2 BASE 960, which is considered a state-of-the-art ASR model on adult speech, on child speech.
Automatic Speech Recognition (ASR)
no code implementations • 22 Mar 2022 • Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Speech synthesis has come a long way as current text-to-speech (TTS) models can now generate natural human-sounding speech.
Automatic Speech Recognition (ASR)
no code implementations • ICCV 2021 • Ayush Chopra, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy
Image-based virtual try-on involves synthesizing perceptually convincing images of a model wearing a particular garment and has garnered significant research interest due to its immense practical applicability.
1 code implementation • NeurIPS 2020 • Michael Cogswell, Jiasen Lu, Rishabh Jain, Stefan Lee, Devi Parikh, Dhruv Batra
Can we develop visually grounded dialog agents that can efficiently adapt to new tasks without forgetting how to talk to people?
1 code implementation • CoNLL 2019 • Pranava Madhyastha, Rishabh Jain
In this paper, we focus on quantifying model stability as a function of random seed by investigating the effects of the induced randomness on model performance and the robustness of the model in general.
1 code implementation • 18 Jun 2019 • Rishabh Jain, Pranava Madhyastha
Explaining and interpreting the decisions of recommender systems is becoming extremely relevant, both for improving predictive performance and for providing valid explanations to users.
3 code implementations • 10 Feb 2019 • Deshraj Yadav, Rishabh Jain, Harsh Agrawal, Prithvijit Chattopadhyay, Taranjeet Singh, Akash Jain, Shiv Baran Singh, Stefan Lee, Dhruv Batra
We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale.
2 code implementations • ICCV 2019 • Harsh Agrawal, Karan Desai, YuFei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson
To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task.