Search Results for author: Leda Sari

Found 7 papers, 1 paper with code

Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions

no code implementations • 18 Nov 2021 • Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf

This paper presents initial speech recognition results on "Casual Conversations" -- a publicly released 846-hour corpus designed to help researchers evaluate their computer vision and audio models for accuracy across a diverse set of metadata, including age, gender, and skin tone.

automatic-speech-recognition Fairness +1

Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering

no code implementations • NAACL 2021 • Kiran Ramnath, Leda Sari, Mark Hasegawa-Johnson, Chang Yoo

Three sub-tasks are proposed: (1) speech-to-text based, (2) end-to-end, without speech-to-text as an intermediate component, and (3) cross-lingual, in which the question is spoken in a language different from that in which the KG is recorded.

Knowledge Graphs Language understanding +2

A Multi-View Approach To Audio-Visual Speaker Verification

no code implementations • 11 Feb 2021 • Leda Sari, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf

Although speaker verification has conventionally been an audio-only task, some practical applications provide both audio and visual streams of input.

Speaker Verification

Deep F-measure Maximization for End-to-End Speech Understanding

no code implementations • 8 Aug 2020 • Leda Sari, Mark Hasegawa-Johnson

We propose a differentiable approximation to the F-measure and train the network with this objective using standard backpropagation.

Fairness Intent Detection +2

Identify Speakers in Cocktail Parties with End-to-End Attention

1 code implementation • 22 May 2020 • Junzhe Zhu, Mark Hasegawa-Johnson, Leda Sari

In scenarios where multiple speakers talk at the same time, it is important to be able to identify the talkers accurately.

Speaker Identification Speech Separation

Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR

no code implementations • 14 Feb 2020 • Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux

We propose an unsupervised speaker adaptation method inspired by the neural Turing machine for end-to-end (E2E) automatic speech recognition (ASR).

automatic-speech-recognition End-To-End Speech Recognition +1
