Unsupervised Speech Recognition
6 papers with code • 0 benchmarks • 0 datasets
Latest papers with no code
Enhancing Unsupervised Speech Recognition with Diffusion GANs
We enhance the vanilla adversarial training method for unsupervised Automatic Speech Recognition (ASR) with a diffusion-GAN.
Simple and Effective Unsupervised Speech Translation
The amount of labeled data available to train models for speech tasks is limited for most languages; however, data scarcity is exacerbated for speech translation, which requires labeled data covering two different languages.
Simple and Effective Unsupervised Speech Synthesis
We introduce the first unsupervised speech synthesis system based on a simple, yet effective recipe.
Analyzing the Robustness of Unsupervised Speech Recognition
In this work, we further analyze the training robustness of unsupervised ASR in domain-mismatch scenarios, in which the domains of the unpaired speech and text differ.
Dynamic Gradient Aggregation for Federated Domain Adaptation
The proposed scheme is based on a weighted gradient aggregation using two-step optimization to offer a flexible training pipeline.
Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning
In this paper we propose a Sequential Representation Quantization AutoEncoder (SeqRQ-AE) to learn from primarily unpaired audio data and produce sequences of representations very close to phoneme sequences of speech utterances.
From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings
However, we note that human babies start to learn a language from the sounds (or phonetic structures) of a small number of exemplar words, and "generalize" such knowledge to other words without hearing a large amount of data.
Completely Unsupervised Speech Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models
Producing a large annotated speech corpus for training ASR systems remains difficult for the more than 95% of the world's languages that are low-resourced, but collecting a relatively large unlabeled dataset for such languages is more achievable.
Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching
We consider the problem of training speech recognition systems without using any labeled data, under the assumption that the learner can access only the input utterances and a phoneme language model estimated from a non-overlapping corpus.
Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data
This can be learned by aligning a small number of spoken words and the corresponding text words in the embedding spaces.