no code implementations • 12 Aug 2024 • Max Nelson, Shannon Wotherspoon, Francis Keith, William Hartmann, Matthew Snover
While transcriptions exist for a number of languages, translated conversational speech is rare and datasets containing summaries are non-existent.
no code implementations • 12 Aug 2024 • Chak-Fai Li, William Hartmann, Matthew Snover
We propose the TOGGL model to simultaneously transcribe the speech of multiple speakers.
no code implementations • 25 Mar 2024 • Shannon Wotherspoon, William Hartmann, Matthew Snover
This paper introduces a set of English translations for a 123-hour subset of the CallHome Mandarin Chinese data and the HKUST Mandarin Telephone Speech data for the task of speech translation.
no code implementations • 16 Jan 2024 • Jonathan Lasko, Jeff Ma, Mike Nicoletti, Jonathan Sussman-Fort, Sooyoung Jeong, William Hartmann
Cognitive load classification is the task of automatically determining an individual's utilization of working memory resources during performance of a task based on physiologic measures such as electroencephalography (EEG).
no code implementations • 27 Oct 2022 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Advances in self-supervised learning have significantly reduced the amount of transcribed audio required for training.
no code implementations • 29 Oct 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover
Final performance improves by an additional 2% absolute when using CTC-based decoding for semi-supervised training compared to shallow fusion.
no code implementations • 14 Jun 2021 • Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover, Owen Kimball
We show that there is a sizable initial gap in such a data condition between hybrid and seq2seq models, and the hybrid model is able to further improve through the use of additional language model (LM) data.
no code implementations • 14 Jun 2021 • Andrew Slottje, Shannon Wotherspoon, William Hartmann, Matthew Snover, Owen Kimball
Labeled code-switched data are rare, so monolingual data are often used to model code-switched speech.
no code implementations • LREC 2020 • Le Zhang, Damianos Karakos, William Hartmann, Manaj Srivastava, Lee Tarlin, David Akodes, Sanjay Krishna Gouda, Numra Bathool, Lingjun Zhao, Zhuolin Jiang, Richard Schwartz, John Makhoul
In this paper, we describe a cross-lingual information retrieval (CLIR) system that, given a query in English, and a set of audio and text documents in a foreign language, can return a scored list of relevant documents, and present findings in a summary form in English.
no code implementations • LREC 2020 • Damianos Karakos, Rabih Zbib, William Hartmann, Richard Schwartz, John Makhoul
In the IARPA MATERIAL program, information retrieval (IR) is treated as a hard detection problem; the system has to output a single global ranking over all queries, and apply a hard threshold on this global list to come up with all the hypothesized relevant documents.
no code implementations • 1 May 2020 • Zhuolin Jiang, Jan Silovsky, Man-Hung Siu, William Hartmann, Herbert Gish, Sancar Adali
Multi-label image classification has generated significant interest in recent years, and the performance of such systems often suffers from the not-so-infrequent occurrence of incorrect or missing labels in the training data.
1 code implementation • LREC 2020 • Zhuolin Jiang, Amro El-Jaroudi, William Hartmann, Damianos Karakos, Lingjun Zhao
Multiple neural language models, e.g., BERT and XLNet, have been developed recently and have achieved impressive results in various NLP tasks, including sentence classification, question answering, and document ranking.
no code implementations • 18 Sep 2019 • Herbert Gish, Jan Silovsky, Man-Ling Sung, Man-Hung Siu, William Hartmann, Zhuolin Jiang
This includes results about the ability of the noisy model to make the same decisions as the clean model and the effects of noise on model performance.