Search Results for author: Kshitiz Kumar

Found 5 papers, 0 papers with code

Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss

no code implementations • 11 Aug 2023 • Mohammad Soleymanpour, Mahmoud Al Ismail, Fahimeh Bahmaninezhad, Kshitiz Kumar, Jian Wu

Our key developments are: (a) a pronunciation lexicon with grapheme units instead of phone units, (b) a fully bilingual alignment model and, subsequently, a bilingual streaming transformer model, (c) a parallel encoder structure with a language identification (LID) loss, and (d) a parallel encoder with an auxiliary loss for monolingual projections.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2 more

Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study

no code implementations • 7 Feb 2022 • Daniel Tompkins, Kshitiz Kumar, Jian Wu

An Xception model reaches state-of-the-art (SOTA) accuracy on the ESC-50 dataset for audio event detection through knowledge transfer from ImageNet weights, pretraining on AudioSet, and an on-the-fly data augmentation pipeline.

Data Augmentation, Event Detection, +1 more

Sequence-level Confidence Classifier for ASR Utterance Accuracy and Application to Acoustic Models

no code implementations • 30 Jun 2021 • Amber Afshan, Kshitiz Kumar, Jian Wu

We propose a cost-effective method of using CC scores to select an optimal adaptation data set, where we maximize ASR gains from minimal data.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1 more

Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

no code implementations • 12 Aug 2020 • Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li

Transfer learning (TL) is widely used in conventional hybrid automatic speech recognition (ASR) systems to transfer knowledge from a source language to a target language.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2 more

Speaker Adaptation for End-to-End CTC Models

no code implementations • 4 Jan 2019 • Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong

We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2 more
