no code implementations • 19 Dec 2013 • Taichi Kiwaki, Takaki Makino, Kazuyuki Aihara
We pursue an early stopping technique that helps Gaussian Restricted Boltzmann Machines (GRBMs) to gain good natural image representations in terms of overcompleteness and data fitting.
1 code implementation • 8 Nov 2019 • Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan
This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture.
Ranked #5 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)
no code implementations • 11 May 2022 • Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao
Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio.
Automatic Speech Recognition (ASR) +2