SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

18 Apr 2019
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le

We present SpecAugment, a simple data augmentation method for speech recognition. SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients)...


Evaluation results from the paper


 SOTA for Speech Recognition on LibriSpeech test-clean (using extra training data)

Task                Dataset                 Model                   Metric name            Metric value  Global rank
Speech Recognition  LibriSpeech test-clean  LAS                     Word Error Rate (WER)  3.20          # 5
Speech Recognition  LibriSpeech test-clean  LAS + SpecAugment       Word Error Rate (WER)  2.50          # 1
Speech Recognition  LibriSpeech test-other  LAS + SpecAugment       Word Error Rate (WER)  5.80          # 2
Speech Recognition  Switchboard + Hub500    LAS + SpecAugment (SM)  Percentage error       6.8           # 6
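
The LibriSpeech rows above report Word Error Rate (WER). As a rough illustration of how this metric is commonly defined (word-level edit distance between reference and hypothesis, divided by the number of reference words, expressed in percent), here is a minimal Python sketch. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the scoring code used to produce the results listed here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Illustrative sketch only: splits on whitespace and applies no text
    normalization, which real scoring tools usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution, or match if cost is 0
            ))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Corpus-level WER is usually computed by summing the edit counts over all utterances and dividing by the total number of reference words, rather than by averaging per-utterance values.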