no code implementations • 6 Jan 2023 • David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister
Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains a challenging task, primarily due to reduced data availability (necessitating increased data collection), and rapidly shifting data distributions (requiring more frequent model fine-tuning).
no code implementations • NAACL 2022 • Zhekun Luo, Shalini Ghosh, Devin Guillory, Keizo Kato, Trevor Darrell, Huijuan Xu
In this paper, we aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns that are unseen during training time, by leveraging the power of knowledge graphs.
no code implementations • 19 May 2022 • David M. Chan, Shalini Ghosh
Deep neural networks have largely demonstrated their ability to perform automated speech recognition (ASR) by extracting meaningful features from input audio frames.
no code implementations • 13 May 2022 • Soumyajit Mitra, Swayambhu Nath Ray, Bharat Padi, Arunasish Sen, Raghavendra Bilgi, Harish Arsikere, Shalini Ghosh, Ajay Srinivasamurthy, Sri Garimella
Modern Automatic Speech Recognition (ASR) systems often use a portfolio of domain-specific models in order to get high accuracy for distinct user utterance types across different devices.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+1
no code implementations • 12 Oct 2021 • David M. Chan, Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister
Traditionally, research in automated speech recognition has focused on local-first encoding of audio representations to predict the spoken phonemes in an utterance.
no code implementations • 5 Jun 2020 • Palash Goyal, Shalini Ghosh
We theoretically show that the proposed loss function is a tighter bound of 0-1 loss compared to any other loss satisfying the hierarchical constraints.
no code implementations • 7 Mar 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multi-modal machine learning (ML) models can process data in multiple modalities (e. g., video, audio, text) and are useful for video content analysis in a variety of problems (e. g., object detection, scene understanding, activity recognition).
no code implementations • 7 Feb 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multimodal ML models can process data in multiple modalities (e. g., video, images, audio, text) and are useful for video content analysis in a variety of problems (e. g., object detection, scene understanding).
no code implementations • 26 Mar 2019 • Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck
The key component of RILOD is a novel incremental learning algorithm that trains end-to-end for one-stage deep object detection models only using training data of new object classes.
no code implementations • 20 Mar 2019 • Jie Zhang, Junting Zhang, Shalini Ghosh, Dawei Li, Jingwen Zhu, Heming Zhang, Yalin Wang
Lifelong learning, the problem of continual learning where tasks arrive in sequence, has been lately attracting more attention in the computer vision community.
2 code implementations • 19 Mar 2019 • Junting Zhang, Jie Zhang, Shalini Ghosh, Dawei Li, Serafettin Tasci, Larry Heck, Heming Zhang, C. -C. Jay Kuo
The idea is to first train a separate model only for the new classes, and then combine the two individual models trained on data of two distinct set of classes (old classes and new classes) via a novel double distillation training objective.
no code implementations • 26 Feb 2019 • Heming Zhang, Shalini Ghosh, Larry Heck, Stephen Walsh, Junting Zhang, Jie Zhang, C. -C. Jay Kuo
The key challenge of generative Visual Dialogue (VD) systems is to respond to human queries with informative answers in natural and contiguous conversation flow.
no code implementations • 15 Feb 2019 • Shalini Ghosh, Giedrius Burachas, Arijit Ray, Avi Ziskind
In this paper, we present a novel approach for the task of eXplainable Question Answering (XQA), i. e., generating natural language (NL) explanations for the Visual Question Answering (VQA) problem.
no code implementations • ICCV 2019 • Ramprasaath R. Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, Devi Parikh
Many vision and language models suffer from poor visual grounding - often falling back on easy-to-learn language priors rather than basing their decisions on visual concepts in the image.
no code implementations • 3 Feb 2019 • Jie Zhang, Xiaolong Wang, Dawei Li, Shalini Ghosh, Abhishek Kolagunda, Yalin Wang
State-of-the-art deep model compression methods exploit the low-rank approximation and sparsity pruning to remove redundant parameters from a learned hidden layer.
no code implementations • 8 Jan 2019 • Anh T. Pham, Shalini Ghosh, Vinod Yegneswaran
In particular, we propose a method of masking the private data with privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, to maintain usability.
no code implementations • 16 Jul 2018 • Amir Asiaee, Hardik Goel, Shalini Ghosh, Vinod Yegneswaran, Arindam Banerjee
Stream deinterleaving is an important problem with various applications in the cybersecurity domain.
no code implementations • 18 May 2018 • Shalini Ghosh, Amaury Mercier, Dheeraj Pichapati, Susmit Jha, Vinod Yegneswaran, Patrick Lincoln
Experiments using our first approach of a multi-headed TNN model, on a dataset generated by a customized version of TORCS, show that (1) adding safety constraints to a neural network model results in increased performance and safety, and (2) the improvement increases with increasing importance of the safety constraints.
1 code implementation • 28 Apr 2018 • Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh
We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds.
no code implementations • 19 Feb 2016 • Shalini Ghosh, Oriol Vinyals, Brian Strope, Scott Roy, Tom Dean, Larry Heck
We evaluate CLSTM on three specific NLP tasks: word prediction, next sentence selection, and sentence topic prediction.
no code implementations • 16 Jul 2014 • Shalini Ghosh, Patrick Lincoln, Christian Petersen, Alfonso Valdes
In this paper, we address the problem of real-time detection of viruses docking to nanowires, especially when multiple viruses dock to the same nano-wire.
no code implementations • 13 Mar 2014 • Shalini Ghosh, Daniel Elenius, Wenchao Li, Patrick Lincoln, Natarajan Shankar, Wilfried Steiner
Requirements are informal and semi-formal descriptions of the expected behavior of a complex system from the viewpoints of its stakeholders (customers, users, operators, designers, and engineers).