State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions

1 Oct 2019
Kyu J. Han, Ramon Prieto, Kaixing Wu, Tao Ma

Self-attention has been highly successful in many downstream NLP tasks, which has led to exploration of applying it to speech problems as well. The efficacy of self-attention in speech applications, however, does not yet seem to be fully realized, since it is challenging to handle highly correlated speech frames in the context of self-attention...



Evaluation Results from the Paper


Task                Dataset                 Model                                                     Metric                 Value  Global Rank
Speech Recognition  LibriSpeech test-clean  Multi-Stream Self-Attention With Dilated 1D Convolutions  Word Error Rate (WER)  2.20   # 1
Speech Recognition  LibriSpeech test-other  Multi-Stream Self-Attention With Dilated 1D Convolutions  Word Error Rate (WER)  5.80   # 3