no code implementations • 24 Oct 2020 • Henry Zhou, Alexei Baevski, Michael Auli
Neural latent variable models enable the discovery of interesting structure in speech audio data.
25 code implementations • NeurIPS 2020 • Alexei Baevski, Henry Zhou, Abdel-rahman Mohamed, Michael Auli
We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler.
Ranked #1 on Speech Recognition on TIMIT (using extra training data)