Robust Speech Recognition

22 papers with code • 0 benchmarks • 3 datasets


Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

yuchen005/robustger 19 Jan 2024

We propose to extract a language-space noise embedding from the N-best list to represent the noise conditions of the source speech, which can promote the denoising process in generative error correction (GER).

54 stars

Single Channel Speech Enhancement Using U-Net Spiking Neural Networks

riaa3102/SESNNet 26 Jul 2023

Speech enhancement (SE) is crucial for reliable communication devices or robust speech recognition systems.

1 star

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

zhuole1025/LyricWhiz 29 Jun 2023

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.

32 stars

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

facebookresearch/muavic 1 Mar 2023

We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation, providing 1200 hours of audio-visual speech in 9 languages.

335 stars

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

yuchen005/gradient-remedy 22 Feb 2023

We propose a simple yet effective approach called gradient remedy (GR) to resolve interference between task gradients in noise-robust speech recognition, from the perspectives of both angle and magnitude.

11 stars

Audio-Visual Efficient Conformer for Robust Speech Recognition

burchim/avec 4 Jan 2023

We improve previous lip reading methods using an Efficient Conformer back-end on top of a ResNet-18 visual front-end and by adding intermediate CTC losses between blocks.

74 stars

Robust Speech Recognition via Large-Scale Weak Supervision

huggingface/transformers Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

124,527 stars
06 Dec 2022

CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations

speech-lab-iitm/ccc-wav2vec-2.0 5 Oct 2022

While self-supervised learning has helped reap the benefits of scale from the available unlabeled data, the learning paradigms are continually being improved.

11 stars

DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition

guozixunnicolas/dent-ddsp 1 Aug 2022

To validate whether the data simulated by DENT-DDSP can replace the scarce in-domain noisy data in noise-robust ASR tasks, several downstream ASR models with the same architecture are trained using the simulated data and the real data.

19 stars

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

espnet/espnet 19 Jul 2022

To showcase such integration, we performed experiments on carefully designed synthetic datasets for noisy-reverberant multi-channel ST and SLU tasks, which can be used as benchmark corpora for future research.

7,858 stars