Search Results for author: Olivier Siohan

Found 8 papers, 1 paper with code

A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection

no code implementations • 11 May 2022 • Otavio Braga, Olivier Siohan

As an alternative, recent work has proposed to address the two problems simultaneously with an attention mechanism, baking the speaker selection problem directly into a fully differentiable model.

Automatic Speech Recognition

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition

no code implementations • 11 May 2022 • Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao

Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio.

Automatic Speech Recognition, Face Selection

Best of Both Worlds: Multi-task Audio-Visual Automatic Speech Recognition and Active Speaker Detection

no code implementations • 10 May 2022 • Otavio Braga, Olivier Siohan

Under noisy conditions, automatic speech recognition (ASR) can greatly benefit from the addition of visual signals coming from a video of the speaker's face.

Automatic Speech Recognition

End-to-end multi-talker audio-visual ASR using an active speaker attention module

no code implementations • 1 Apr 2022 • Richard Rose, Olivier Siohan

This paper presents a new approach for end-to-end audio-visual multi-talker speech recognition.

Speech Recognition

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models

no code implementations • 25 Apr 2021 • Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao

To improve streaming models, a recent study [1] proposed to distill a non-streaming teacher model on unsupervised utterances, and then train a streaming student using the teachers' predictions.

Automatic Speech Recognition
