Search Results for author: Aparna Khare

Found 12 papers, 0 papers with code

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

no code implementations · 28 Mar 2024 · Yash Jain, David Chan, Pranav Dheram, Aparna Khare, Olabanji Shonibare, Venkatesh Ravichandran, Shalini Ghosh

Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

no code implementations · 26 Jan 2024 · Jinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran

We propose an approach for continuous prediction of turn-taking and backchanneling locations in spoken dialogue by fusing a neural acoustic model with a large language model (LLM).

Language Modelling, Large Language Model

Cross-utterance ASR Rescoring with Graph-based Label Propagation

no code implementations · 27 Mar 2023 · Srinath Tankasala, Long Chen, Andreas Stolcke, Anirudh Raju, Qianli Deng, Chander Chandak, Aparna Khare, Roland Maas, Venkatesh Ravichandran

We propose a novel approach for ASR N-best hypothesis rescoring with graph-based label propagation by leveraging cross-utterance acoustic similarity.

Fairness, Language Modelling

Guided contrastive self-supervised pre-training for automatic speech recognition

no code implementations · 22 Oct 2022 · Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

ASR-Aware End-to-end Neural Diarization

no code implementations · 2 Feb 2022 · Aparna Khare, Eunjung Han, Yuguang Yang, Andreas Stolcke

We present a Conformer-based end-to-end neural diarization (EEND) model that uses both acoustic input and features derived from an automatic speech recognition (ASR) model.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Audiovisual Highlight Detection in Videos

no code implementations · 11 Feb 2021 · Karel Mundnich, Alexandra Fenster, Aparna Khare, Shiva Sundaram

To better study the task of highlight detection, we run a pilot experiment with highlights annotations for a small subset of video clips and fine-tune our best model on it.

Highlight Detection, Object Recognition, +2

Self-Supervised learning with cross-modal transformers for emotion recognition

no code implementations · 20 Nov 2020 · Aparna Khare, Srinivas Parthasarathy, Shiva Sundaram

Self-supervised learning has shown improvements on tasks with limited labeled datasets in domains like speech and natural language.

Emotion Recognition, Language Modelling, +4

Fully Learnable Front-End for Multi-Channel Acoustic Modeling using Semi-Supervised Learning

no code implementations · 1 Feb 2020 · Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram

Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model used in the speech recognition system.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1
