ICLR 2018 • Anuroop Sriram, Heewoo Jun, Sanjeev Satheesh, Adam Coates
Sequence-to-sequence (Seq2Seq) models with attention have excelled at tasks that involve generating natural-language sentences, such as machine translation, image captioning, and speech recognition.
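The attention step these Seq2Seq models rely on can be sketched in a few lines: score each encoder state against the current decoder state, softmax the scores into weights, and take a weighted sum as the context vector. This is a minimal illustrative sketch (function name and toy vectors are assumptions, not the paper's implementation):

```python
import math

def attention(decoder_state, encoder_states):
    """Dot-product attention over a list of encoder state vectors."""
    # Alignment score per encoder time step: dot product with decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    # Softmax over time steps turns scores into attention weights.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: attention-weighted sum of encoder states.
    dim = len(decoder_state)
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example: three encoder steps, hidden size 2.
weights, context = attention([1.0, 0.0],
                             [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The decoder then conditions its next output token on the context vector, which is what lets the model "attend" to different input positions at each generation step.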
24 Jul 2017 • Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
In this work, we perform an empirical comparison among the CTC, RNN-Transducer, and attention-based Seq2Seq models for end-to-end speech recognition.
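To make the CTC side of that comparison concrete, the standard CTC greedy-decoding rule collapses consecutive repeated frame labels and then drops the blank symbol. A small sketch of that rule (blank index and example labels are illustrative assumptions):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeats, then remove CTC blank symbols."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev:          # collapse runs of the same frame label
            if lab != blank:     # drop the blank symbol entirely
                out.append(lab)
        prev = lab
    return out

# Frame-level argmax labels with 0 as blank; note the blank between
# the two 2s keeps them as distinct output symbols.
decoded = ctc_greedy_decode([1, 1, 0, 2, 2, 0, 2, 3])
```

The blank symbol is what lets CTC emit repeated characters, which is one structural difference from RNN-Transducer and attention-based Seq2Seq decoders.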
ICLR 2018 • Han Zhao, Zhenyao Zhu, Junjie Hu, Adam Coates, Geoff Gordon
This provides a very general way to interpolate between generative and discriminative extremes through different choices of priors.
11 May 2017 • Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu
Replacing hand-engineered pipelines with end-to-end deep learning systems has enabled strong results in applications like speech and object recognition.
15 Mar 2017 • Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
Keyword spotting (KWS) constitutes a major component of human-technology interfaces.
ICML 2017 • Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xi-An Li, John Miller, Andrew Ng, Jonathan Raiman, Shubho Sengupta, Mohammad Shoeybi
We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks.
10 Dec 2016 • Jiaji Huang, Rewon Child, Vinay Rao, Hairong Liu, Sanjeev Satheesh, Adam Coates
For speech recognition, confidence scores and other likelihood-based active learning methods have been shown to be effective.
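The core of confidence-based active learning is simple to state: rank the unlabeled utterances by the model's confidence (e.g., the probability of its best hypothesis) and send the least confident ones for transcription. A hedged sketch under those assumptions (utterance IDs, scores, and the function itself are illustrative, not the paper's code):

```python
def select_for_labeling(utterances, confidences, budget):
    """Pick the `budget` least-confident utterances for human labeling.

    `confidences` holds one model-confidence score per utterance,
    e.g. the posterior probability of the top recognition hypothesis.
    """
    # Sort ascending by confidence so the least certain come first.
    ranked = sorted(zip(confidences, utterances))
    return [utt for _, utt in ranked[:budget]]

picked = select_for_labeling(
    ["utt_a", "utt_b", "utt_c", "utt_d"],
    [0.92, 0.35, 0.70, 0.51],
    budget=2,
)
```

Labeling effort then concentrates where the model is most likely wrong, which is the intuition behind likelihood-based selection criteria.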
8 Dec 2015 • Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech, two vastly different languages.
7 Apr 2015 • Brody Huval, Tao Wang, Sameep Tandon, Jeff Kiske, Will Song, Joel Pazhayampallil, Mykhaylo Andriluka, Pranav Rajpurkar, Toki Migimatsu, Royce Cheng-Yue, Fernando Mujica, Adam Coates, Andrew Y. Ng
We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection.
Ranked #2 on Lane Detection on Caltech Lanes Cordova
17 Dec 2014 • Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng
We present a state-of-the-art speech recognition system developed using end-to-end deep learning.
24 Dec 2013 • Brody Huval, Adam Coates, Andrew Ng
We investigate the use of deep neural networks for the novel task of class-generic object detection.
NeurIPS 2012 • Adam Coates, Andrej Karpathy, Andrew Y. Ng
Recent work in unsupervised feature learning has focused on the goal of discovering high-level features from unlabeled images.
NeurIPS 2011 • Adam Coates, Andrew Y. Ng
Recent deep learning and unsupervised feature learning systems that learn from unlabeled data have achieved high performance in benchmarks by using extremely large architectures with many features (hidden units) at each layer.