no code implementations • 25 Mar 2024 • Shannon Wotherspoon, William Hartmann, Matthew Snover
This paper introduces a set of English translations for a 123-hour subset of the CallHome Mandarin Chinese and HKUST Mandarin Telephone Speech corpora for the task of speech translation.
no code implementations • 27 Oct 2022 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Advances in self-supervised learning have significantly reduced the amount of transcribed audio required for training.
no code implementations • 29 Oct 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Final performance improves by an additional 2% absolute when CTC-based decoding is used for semi-supervised training instead of shallow fusion.
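The shallow-fusion baseline mentioned above combines the ASR model's token score with an external language model's score at each decoding step. A minimal sketch, with hypothetical probabilities and a hypothetical `lm_weight` (not values from the paper):

```python
import math

def shallow_fusion_score(asr_logprob, lm_logprob, lm_weight=0.3):
    # Shallow fusion: interpolate the ASR log-probability with the
    # external LM log-probability using a tunable weight.
    return asr_logprob + lm_weight * lm_logprob

# Choosing between two candidate tokens at one beam-search step
# (illustrative numbers only).
candidates = {
    "token_a": shallow_fusion_score(math.log(0.6), math.log(0.1)),
    "token_b": shallow_fusion_score(math.log(0.4), math.log(0.5)),
}
best = max(candidates, key=candidates.get)
```

Here the LM score flips the decision toward `token_b` even though the ASR model alone prefers `token_a`, which is the behavior semi-supervised training with CTC-based decoding avoids by relying on the acoustic model's own hypotheses.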
no code implementations • 14 Jun 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover, Owen Kimball
We show that, in such a data condition, there is a sizable initial gap between hybrid and seq2seq models, and that the hybrid model is able to improve further through the use of additional language model (LM) data.
Automatic Speech Recognition (ASR) +3
no code implementations • 14 Jun 2021 • Andrew Slottje, Shannon Wotherspoon, William Hartmann, Matthew Snover, Owen Kimball
Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech.
Automatic Speech Recognition (ASR) +1