no code implementations • 25 Mar 2024 • Shannon Wotherspoon, William Hartmann, Matthew Snover
This paper introduces a set of English translations for a 123-hour subset of the CallHome Mandarin Chinese and HKUST Mandarin Telephone Speech corpora for the task of speech translation.
no code implementations • 27 Oct 2022 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Advances in self-supervised learning have significantly reduced the amount of transcribed audio required for training.
no code implementations • 29 Oct 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Final performance improves by an additional 2% absolute when CTC-based decoding is used for semi-supervised training instead of shallow fusion.
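The shallow-fusion baseline mentioned above combines the ASR model's token score with an external language model's score at each decoding step. A minimal sketch, with hypothetical probabilities and a hypothetical `lm_weight` (not values from the paper):

```python
import math

def shallow_fusion_score(asr_logprob, lm_logprob, lm_weight=0.3):
    # Shallow fusion: interpolate the ASR log-probability with the
    # external LM log-probability using a tunable weight.
    return asr_logprob + lm_weight * lm_logprob

# Choosing between two candidate tokens at one beam-search step
# (illustrative numbers only).
candidates = {
    "token_a": shallow_fusion_score(math.log(0.6), math.log(0.1)),
    "token_b": shallow_fusion_score(math.log(0.4), math.log(0.5)),
}
best = max(candidates, key=candidates.get)
```

Here the LM score flips the decision toward `token_b` even though the ASR model alone prefers `token_a`, which is the behavior semi-supervised training with CTC-based decoding avoids by relying on the acoustic model's own hypotheses.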
no code implementations • 14 Jun 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover, Owen Kimball
We show that, in such a data condition, there is a sizable initial gap between hybrid and seq2seq models, and that the hybrid model is able to improve further through the use of additional language model (LM) data.
Automatic Speech Recognition (ASR) +3
no code implementations • 14 Jun 2021 • Andrew Slottje, Shannon Wotherspoon, William Hartmann, Matthew Snover, Owen Kimball
Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech.
Automatic Speech Recognition (ASR) +1