Search Results for author: Saurabhchand Bhati

Found 8 papers, 1 papers with code

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning

no code implementations8 Sep 2023 Saurabhchand Bhati, Jesús Villalba, Laureano Moro-Velazquez, Thomas Thebaud, Najim Dehak

Cascaded SpeechCLIP attempted to generate localized word-level information and utilize both the pretrained image and text encoders.

audio-visual learning Quantization +1

Regularizing Contrastive Predictive Coding for Speech Applications

no code implementations12 Apr 2023 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

These representations significantly reduce the amount of labeled data needed for downstream task performance, such as automatic speech recognition.

Acoustic Unit Discovery Automatic Speech Recognition +3

Guided contrastive self-supervised pre-training for automatic speech recognition

no code implementations22 Oct 2022 Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

Contrastive Predictive Coding (CPC) is a representation learning method that maximizes the mutual information between intermediate latent representations and the output of a given model.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

1 code implementation26 Jan 2022 Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

In this paper, we 1) investigate the influence of different factors (i. e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language; 2) provide an analysis of which phones transfer well across languages and which do not in order to understand the limitations of and areas for further improvement for automatic phone inventory creation; and 3) present different methods to build a phone inventory of an unseen language in an unsupervised way.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation

no code implementations3 Jun 2021 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

We overcome this limitation with a segmental contrastive predictive coding (SCPC) framework that can model the signal structure at a higher level e. g. at the phoneme level.

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery

no code implementations26 Jul 2020 Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Najim Dehak

We perform segmentation based on the assumption that the frame feature vectors are more similar within a segment than across the segments.

Segmentation

Cannot find the paper you are looking for? You can Submit a new open access paper.