End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection

3 Feb 2020 · Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe

This paper integrates a voice activity detection (VAD) function with end-to-end automatic speech recognition, targeting online speech interfaces and the transcription of very long audio recordings. We focus on connectionist temporal classification (CTC) and its extension of CTC/attention architectures...
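The abstract is truncated here, so the detection rule itself is not shown. As a rough illustration of how frame-level CTC output could double as a VAD signal, the sketch below treats long runs of the CTC blank label as non-speech and splits the audio at those runs. The function name `speech_segments`, the parameter `min_blank_run`, and the choice of `0` as the blank index are all hypothetical and chosen for illustration; this is not taken from the authors' implementation.

```python
from typing import List, Sequence, Tuple


def speech_segments(
    frame_labels: Sequence[int],
    blank: int = 0,          # assumed index of the CTC blank label
    min_blank_run: int = 3,  # hypothetical threshold, in frames
) -> List[Tuple[int, int]]:
    """Split greedy per-frame CTC labels into speech segments.

    A segment is closed once `min_blank_run` consecutive blank frames
    follow it. Returns (start, end) frame indices with `end` exclusive.
    """
    segments: List[Tuple[int, int]] = []
    start = None           # first frame of the segment currently open
    last_non_blank = None  # most recent non-blank frame in that segment
    blank_run = 0          # length of the current run of blank frames

    for t, label in enumerate(frame_labels):
        if label != blank:
            if start is None:
                start = t
            last_non_blank = t
            blank_run = 0
        else:
            blank_run += 1
            # A long enough blank run is treated as non-speech.
            if start is not None and blank_run >= min_blank_run:
                segments.append((start, last_non_blank + 1))
                start = None

    # Flush a segment that is still open at the end of the input.
    if start is not None:
        segments.append((start, last_non_blank + 1))
    return segments


if __name__ == "__main__":
    labels = [0, 0, 5, 0, 7, 0, 0, 0, 0, 3, 3, 0]
    print(speech_segments(labels, blank=0, min_blank_run=3))
```

In this sketch, short blank gaps inside an utterance (such as the single blank between labels 5 and 7) do not split the segment; only a run that reaches the threshold does. A real system would choose the threshold according to the encoder's frame rate.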

