End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures

19 Nov 2019 · Gabriel Synnaeve, Qiantong Xu, Jacob Kahn, Edouard Grave, Tatiana Likhomanenko, Vineel Pratap, Anuroop Sriram, Vitaliy Liptchinsky, Ronan Collobert

We study ResNet-, Time-Depth Separable ConvNets-, and Transformer-based acoustic models, trained with CTC or Seq2Seq criteria. We perform experiments on the LibriSpeech dataset, with and without LM decoding, optionally with beam rescoring...

