The multiple-hypothesis approach yields a relative WER reduction of 3.3% on the CHiME-4 single-channel real noisy evaluation set when compared with the single-hypothesis approach.
In this paper, we propose an online attention mechanism, known as cumulative attention (CA), for streaming Transformer-based automatic speech recognition (ASR).
Online Transformer-based automatic speech recognition (ASR) systems have been extensively studied due to the increasing demand for streaming applications.
In this paper, we present the Adaptive Computation Steps (ACS) algorithm, which enables end-to-end speech recognition models to dynamically decide how many frames should be processed to predict a linguistic output.
Ranked #7 on Speech Recognition on AISHELL-1