no code implementations • 30 Jan 2024 • Allen Chen, Okan Tanrikulu
QA models are faced with complex and open-ended contextual reasoning problems, but can often learn well-performing solution heuristics by exploiting dataset-specific patterns in their training data.
no code implementations • 8 Jan 2024 • Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic
In this paper, we eliminate the hypothesis-audio mismatch problem by querying the correction database directly using embeddings derived from the utterance audio; the embeddings of the utterance audio and candidate corrections are produced by multimodal speech-text embedding networks trained to place the embedding of the audio of an utterance and the embedding of its corresponding textual transcript close together.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1