Search Results for author: Andrew Senior

Found 10 papers, 5 papers with code

Deep Neural Networks for Acoustic Modeling in Speech Recognition

no code implementations • Signal Processing Magazine 2012 • Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, Brian Kingsbury

Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input.

Speech Recognition


Large Scale Distributed Deep Networks

no code implementations • NeurIPS 2012 • Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, Quoc V. Le, Andrew Y. Ng

Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance.

Object Recognition speech-recognition +1


Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition

1 code implementation • 5 Feb 2014 • Haşim Sak, Andrew Senior, Françoise Beaufays

However, in contrast to the deep neural networks, the use of RNNs in speech recognition has been limited to phone recognition in small scale tasks.

Handwriting Recognition Language Modelling +2


Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition

no code implementations • 24 Jul 2015 • Haşim Sak, Andrew Senior, Kanishka Rao, Françoise Beaufays

We have recently shown that deep Long Short-Term Memory (LSTM) recurrent neural networks (RNNs) outperform feed forward deep neural networks (DNNs) as acoustic models for speech recognition.

General Classification speech-recognition +1


Latent Predictor Networks for Code Generation

2 code implementations • ACL 2016 • Wang Ling, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Andrew Senior, Fumin Wang, Phil Blunsom

Many language generation tasks require the production of text conditioned on both structured and unstructured inputs.

Ranked #10 on Code Generation on Django

Card Games Code Generation +1



WaveNet: A Generative Model for Raw Audio

60 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.

Ranked #1 on Speech Synthesis on Mandarin Chinese

Audio Generation Speech Synthesis



Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

no code implementations • NeurIPS 2016 • Jack W. Rae, Jonathan J. Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap

SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories.

Ranked #6 on Question Answering on bAbi (Mean Error Rate metric)

Language Modelling Machine Translation +2


Lip Reading Sentences in the Wild

1 code implementation • CVPR 2017 • Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman

The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio.

Ranked #4 on Lipreading on GRID corpus (mixed-speech) (using extra training data)

Lipreading Lip Reading +2


Large-Scale Visual Speech Recognition

no code implementations • ICLR 2019 • Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video).

Ranked #11 on Lipreading on LRS3-TED (using extra training data)

Lipreading speech-recognition +1


Deep Audio-Visual Speech Recognition

4 code implementations • 6 Sep 2018 • Triantafyllos Afouras, Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman

The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio.

Ranked #6 on Audio-Visual Speech Recognition on LRS2

Audio-Visual Speech Recognition Automatic Speech Recognition (ASR) +4


