Search Results for author: Takaki Makino

Found 3 papers, 1 paper with code

End-to-End Multi-Person Audio/Visual Automatic Speech Recognition

no code implementations • 11 May 2022 • Otavio Braga, Takaki Makino, Olivier Siohan, Hank Liao

Traditionally, audio-visual automatic speech recognition has been studied under the assumption that the speaking face on the visual signal is the face matching the audio.

Automatic Speech Recognition (ASR) +2

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

1 code implementation • 8 Nov 2019 • Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture.

Ranked #5 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)

Audio-Visual Speech Recognition, Lipreading +2

Approximated Infomax Early Stopping: Revisiting Gaussian RBMs on Natural Images

no code implementations • 19 Dec 2013 • Taichi Kiwaki, Takaki Makino, Kazuyuki Aihara

We pursue an early stopping technique that helps Gaussian Restricted Boltzmann Machines (GRBMs) to gain good natural image representations in terms of overcompleteness and data fitting.

Attribute
