2 code implementations • 19 Mar 2019 • Junting Zhang, Jie Zhang, Shalini Ghosh, Dawei Li, Serafettin Tasci, Larry Heck, Heming Zhang, C. -C. Jay Kuo
The idea is to first train a separate model only for the new classes, and then combine the two individual models trained on data of two distinct set of classes (old classes and new classes) via a novel double distillation training objective.
1 code implementation • 28 Apr 2018 • Sridhar Mahadevan, Bamdev Mishra, Shalini Ghosh
We present a novel framework for domain adaptation, whereby both geometric and statistical differences between a labeled source domain and unlabeled target domain can be integrated by exploiting the curved Riemannian geometry of statistical manifolds.
1 code implementation • 4 Jan 2024 • David M. Chan, Shalini Ghosh, Hitesh Tulsiani, Ariya Rastrow, Björn Hoffmeister
We demonstrate that our CLC family of approaches can improve the performance of ASR models on OD3, a new public large-scale semi-synthetic meta-dataset of audio task-oriented dialogues, by up to 19. 2%.
no code implementations • 18 May 2018 • Shalini Ghosh, Amaury Mercier, Dheeraj Pichapati, Susmit Jha, Vinod Yegneswaran, Patrick Lincoln
Experiments using our first approach of a multi-headed TNN model, on a dataset generated by a customized version of TORCS, show that (1) adding safety constraints to a neural network model results in increased performance and safety, and (2) the improvement increases with increasing importance of the safety constraints.
no code implementations • 19 Feb 2016 • Shalini Ghosh, Oriol Vinyals, Brian Strope, Scott Roy, Tom Dean, Larry Heck
We evaluate CLSTM on three specific NLP tasks: word prediction, next sentence selection, and sentence topic prediction.
no code implementations • 13 Mar 2014 • Shalini Ghosh, Daniel Elenius, Wenchao Li, Patrick Lincoln, Natarajan Shankar, Wilfried Steiner
Requirements are informal and semi-formal descriptions of the expected behavior of a complex system from the viewpoints of its stakeholders (customers, users, operators, designers, and engineers).
no code implementations • 16 Jul 2014 • Shalini Ghosh, Patrick Lincoln, Christian Petersen, Alfonso Valdes
In this paper, we address the problem of real-time detection of viruses docking to nanowires, especially when multiple viruses dock to the same nano-wire.
no code implementations • 16 Jul 2018 • Amir Asiaee, Hardik Goel, Shalini Ghosh, Vinod Yegneswaran, Arindam Banerjee
Stream deinterleaving is an important problem with various applications in the cybersecurity domain.
no code implementations • 8 Jan 2019 • Anh T. Pham, Shalini Ghosh, Vinod Yegneswaran
In particular, we propose a method of masking the private data with privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, to maintain usability.
no code implementations • 3 Feb 2019 • Jie Zhang, Xiaolong Wang, Dawei Li, Shalini Ghosh, Abhishek Kolagunda, Yalin Wang
State-of-the-art deep model compression methods exploit the low-rank approximation and sparsity pruning to remove redundant parameters from a learned hidden layer.
no code implementations • ICCV 2019 • Ramprasaath R. Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, Devi Parikh
Many vision and language models suffer from poor visual grounding - often falling back on easy-to-learn language priors rather than basing their decisions on visual concepts in the image.
no code implementations • 15 Feb 2019 • Shalini Ghosh, Giedrius Burachas, Arijit Ray, Avi Ziskind
In this paper, we present a novel approach for the task of eXplainable Question Answering (XQA), i. e., generating natural language (NL) explanations for the Visual Question Answering (VQA) problem.
no code implementations • 26 Feb 2019 • Heming Zhang, Shalini Ghosh, Larry Heck, Stephen Walsh, Junting Zhang, Jie Zhang, C. -C. Jay Kuo
The key challenge of generative Visual Dialogue (VD) systems is to respond to human queries with informative answers in natural and contiguous conversation flow.
no code implementations • 20 Mar 2019 • Jie Zhang, Junting Zhang, Shalini Ghosh, Dawei Li, Jingwen Zhu, Heming Zhang, Yalin Wang
Lifelong learning, the problem of continual learning where tasks arrive in sequence, has been lately attracting more attention in the computer vision community.
no code implementations • 26 Mar 2019 • Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck
The key component of RILOD is a novel incremental learning algorithm that trains end-to-end for one-stage deep object detection models only using training data of new object classes.
no code implementations • 7 Feb 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multimodal ML models can process data in multiple modalities (e. g., video, images, audio, text) and are useful for video content analysis in a variety of problems (e. g., object detection, scene understanding).
no code implementations • 7 Mar 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multi-modal machine learning (ML) models can process data in multiple modalities (e. g., video, audio, text) and are useful for video content analysis in a variety of problems (e. g., object detection, scene understanding, activity recognition).
no code implementations • 5 Jun 2020 • Palash Goyal, Shalini Ghosh
We theoretically show that the proposed loss function is a tighter bound of 0-1 loss compared to any other loss satisfying the hierarchical constraints.
no code implementations • 12 Oct 2021 • David M. Chan, Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister
Traditionally, research in automated speech recognition has focused on local-first encoding of audio representations to predict the spoken phonemes in an utterance.
no code implementations • 13 May 2022 • Soumyajit Mitra, Swayambhu Nath Ray, Bharat Padi, Arunasish Sen, Raghavendra Bilgi, Harish Arsikere, Shalini Ghosh, Ajay Srinivasamurthy, Sri Garimella
Modern Automatic Speech Recognition (ASR) systems often use a portfolio of domain-specific models in order to get high accuracy for distinct user utterance types across different devices.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 19 May 2022 • David M. Chan, Shalini Ghosh
Deep neural networks have largely demonstrated their ability to perform automated speech recognition (ASR) by extracting meaningful features from input audio frames.
no code implementations • NAACL 2022 • Zhekun Luo, Shalini Ghosh, Devin Guillory, Keizo Kato, Trevor Darrell, Huijuan Xu
In this paper, we aim to improve the generalization ability of the compositional action recognition model to novel verbs or novel nouns that are unseen during training time, by leveraging the power of knowledge graphs.
no code implementations • 6 Jan 2023 • David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister
Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains a challenging task, primarily due to reduced data availability (necessitating increased data collection), and rapidly shifting data distributions (requiring more frequent model fine-tuning).
no code implementations • 4 Apr 2023 • Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza
Currently popular video-text data mining approach via automatic speech recognition (ASR) used in HowTo100M provides low-quality captions that often do not refer to the video content.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +5
no code implementations • 8 Aug 2023 • Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
Here we propose an automatic red teaming framework that evaluates a given model and exposes its vulnerabilities against unsafe and inappropriate content generation.
no code implementations • 27 Sep 2023 • Chao-Han Huck Yang, Yile Gu, Yi-Chieh Liu, Shalini Ghosh, Ivan Bulyko, Andreas Stolcke
We explore the ability of large language models (LLMs) to act as speech recognition post-processors that perform rescoring and error correction.
Ranked #2 on Speech Recognition on WSJ eval92 (using extra training data)
no code implementations • 26 Sep 2023 • Yu Yu, Chao-Han Huck Yang, Jari Kolehmainen, Prashanth G. Shivakumar, Yile Gu, Sungho Ryu, Roger Ren, Qi Luo, Aditya Gourav, I-Fan Chen, Yi-Chieh Liu, Tuan Dinh, Ankur Gandhe, Denis Filimonov, Shalini Ghosh, Andreas Stolcke, Ariya Rastow, Ivan Bulyko
We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring.
no code implementations • 16 Nov 2023 • Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
With the recent surge of language models in different applications, attention to safety and robustness of these models has gained significant importance.
no code implementations • 22 Dec 2023 • Anirudh S. Sundar, Chao-Han Huck Yang, David M. Chan, Shalini Ghosh, Venkatesh Ravichandran, Phani Sankar Nidadavolu
In cases where some data/compute is available, we present Learnable-MAM, a data-driven approach to merging attention matrices, resulting in a further 2. 90% relative reduction in WER for ASR and 18. 42% relative reduction in AEC compared to fine-tuning.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 23 Dec 2023 • Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko
Specifically, our framework serializes tasks in the order of current paralinguistic attribute prediction, response paralinguistic attribute prediction, and response text generation with autoregressive conditioning.
no code implementations • 5 Jan 2024 • Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke
In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text.