no code implementations • 1 Dec 2017 • Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze
We work on a corpus of "how-to" videos from the web. The idea is that an object that can be seen ("car") or a scene that is detected ("kitchen") can be used to condition both the acoustic model and the language model on the "context" of the recording, thereby reducing perplexity and improving transcription.
4 code implementations • 29 Jul 2015 • Yajie Miao, Mohammad Gowayyed, Florian Metze
The performance of automatic speech recognition (ASR) has improved tremendously due to the application of deep neural networks (DNNs).
Automatic Speech Recognition (ASR)
no code implementations • 27 Jan 2014 • Yajie Miao
Using these recipes, we can build up multiple systems including DNN hybrid systems, convolutional neural network (CNN) systems and bottleneck feature systems.