Search Results for author: Rishabh Jain

Found 23 papers, 10 papers with code

Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys

no code implementations • 18 Nov 2023 • Gabriel Cosache, Francisco Salgado, Cosmin Rotariu, George Sterpu, Rishabh Jain, Peter Corcoran

An overview is given of the DAVID Smart-Toy platform, one of the first Edge AI platform designs to incorporate advanced low-power data processing by neural inference models co-located with the relevant image or audio sensors.

Synthetic Speaking Children -- Why We Need Them and How to Make Them

no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran

Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.

A comparative analysis between Conformer-Transducer, Whisper, and wav2vec2 for improving the child speech recognition

1 code implementation • 7 Nov 2023 • Andrei Barcovschi, Rishabh Jain, Peter Corcoran

We demonstrate that finetuning Conformer-transducer models on child speech yields significant improvements in ASR performance on child speech, compared to the non-finetuned models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning

1 code implementation • 7 Nov 2023 • Rishabh Jain, Peter Corcoran

The approach involved finetuning a multi-speaker TTS model to work with child speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

MAGIC-TBR: Multiview Attention Fusion for Transformer-based Bodily Behavior Recognition in Group Settings

1 code implementation • 19 Sep 2023 • Surbhi Madan, Rishabh Jain, Gulshan Sharma, Ramanathan Subramanian, Abhinav Dhall

Bodily behavioral language is an important social cue, and its automated analysis helps in enhancing the understanding of artificial intelligence systems.

Pose Estimation

Automatic Concept Embedding Model (ACEM): No train-time concepts, No issue!

no code implementations • 7 Sep 2023 • Rishabh Jain

Interpretability and explainability of neural networks are continuously increasing in importance, especially within safety-critical domains and for providing the social right to explanation.

Adaptation of Whisper models to child speech recognition

1 code implementation • 24 Jul 2023 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Peter Corcoran, Horia Cucu

We demonstrate that finetuning Whisper on child speech yields significant improvements in ASR performance on child speech, compared to non-finetuned Whisper models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Neural Priority Queues for Graph Neural Networks

no code implementations • 18 Jul 2023 • Rishabh Jain, Petar Veličković, Pietro Liò

Graph Neural Networks (GNNs) have shown considerable success in neural algorithmic reasoning.

Ranked #20 on Graph Regression on Peptides-struct

Graph Regression

FODVid: Flow-guided Object Discovery in Videos

no code implementations • 10 Jul 2023 • Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy

Segmentation of objects in a video is challenging due to nuances such as motion blurring, parallax, occlusions, and changes in illumination.

Object Object Discovery +5

Parameter Efficient Local Implicit Image Function Network for Face Segmentation

no code implementations • CVPR 2023 • Mausoom Sarkar, Nikitha SR, Mayur Hemani, Rishabh Jain, Balaji Krishnamurthy

Face parsing is defined as the per-pixel labeling of images containing human faces.

Face Parsing Segmentation

UMFuse: Unified Multi View Fusion for Human Editing applications

no code implementations • ICCV 2023 • Rishabh Jain, Mayur Hemani, Duygu Ceylan, Krishna Kumar Singh, Jingwan Lu, Mausoom Sarkar, Balaji Krishnamurthy

Numerous pose-guided human editing methods have been explored by the vision community due to their extensive practical applications.

Image Generation Retrieval +1

VGFlow: Visibility guided Flow Network for Human Reposing

no code implementations • CVPR 2023 • Rishabh Jain, Krishna Kumar Singh, Mayur Hemani, Jingwan Lu, Mausoom Sarkar, Duygu Ceylan, Balaji Krishnamurthy

The task of human reposing involves generating a realistic image of a person standing in an arbitrary conceivable pose.

SSIM

Extending Logic Explained Networks to Text Classification

1 code implementation • 4 Nov 2022 • Rishabh Jain, Gabriele Ciravegna, Pietro Barbiero, Francesco Giannini, Davide Buffelli, Pietro Liò

Recently, Logic Explained Networks (LENs) have been proposed as explainable-by-design neural models providing logic explanations for their predictions.

text-classification Text Classification

Analysis of Distributed Deep Learning in the Cloud

no code implementations • 30 Aug 2022 • Aakash Sharma, Vivek M. Bhasi, Sonali Singh, Rishabh Jain, Jashwant Raj Gunasekaran, Subrata Mitra, Mahmut Taylan Kandemir, George Kesidis, Chita R. Das

We aim to resolve this problem by introducing a comprehensive distributed deep learning (DDL) profiler, which can determine the various execution "stalls" that DDL suffers from while running on a public cloud.

A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition

no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

Using just 10 hours of child speech data for finetuning, our models outperformed wav2vec2 BASE 960, which is considered a state-of-the-art ASR model on adult speech, on child speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis

no code implementations • 22 Mar 2022 • Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

Speech synthesis has come a long way as current text-to-speech (TTS) models can now generate natural human-sounding speech.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

ZFlow: Gated Appearance Flow-based Virtual Try-on with 3D Priors

no code implementations • ICCV 2021 • Ayush Chopra, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy

Image-based virtual try-on involves synthesizing perceptually convincing images of a model wearing a particular garment and has garnered significant research interest due to its immense practical applicability.

SSIM Virtual Try-on

Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data

1 code implementation • NeurIPS 2020 • Michael Cogswell, Jiasen Lu, Rishabh Jain, Stefan Lee, Devi Parikh, Dhruv Batra

Can we develop visually grounded dialog agents that can efficiently adapt to new tasks without forgetting how to talk to people?

Visual Dialog Visual Question Answering (VQA)

On Model Stability as a Function of Random Seed

1 code implementation • CONLL 2019 • Pranava Madhyastha, Rishabh Jain

In this paper, we focus on quantifying model stability as a function of random seed by investigating the effects of the induced randomness on model performance and the robustness of the model in general.

counterfactual

Model Explanations under Calibration

1 code implementation • 18 Jun 2019 • Rishabh Jain, Pranava Madhyastha

Explaining and interpreting the decisions of recommender systems is becoming extremely relevant, both for improving predictive performance and for providing valid explanations to users.

Recommendation Systems valid

EvalAI: Towards Better Evaluation Systems for AI Agents

3 code implementations • 10 Feb 2019 • Deshraj Yadav, Rishabh Jain, Harsh Agrawal, Prithvijit Chattopadhyay, Taranjeet Singh, Akash Jain, Shiv Baran Singh, Stefan Lee, Dhruv Batra

We introduce EvalAI, an open source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale.

Benchmarking BIG-bench Machine Learning

nocaps: novel object captioning at scale

2 code implementations • ICCV 2019 • Harsh Agrawal, Karan Desai, YuFei Wang, Xinlei Chen, Rishabh Jain, Mark Johnson, Dhruv Batra, Devi Parikh, Stefan Lee, Peter Anderson

To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task.

Image Captioning Object +2

Experiments on Morphological Reinflection: CoNLL-2018 Shared Task

no code implementations • CONLL 2018 • Rishabh Jain, Anil Kumar Singh

Machine Translation Morphological Inflection
