SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition

18 Apr 2019
Daniel S. Park, William Chan, Yu Zhang, Chung-Cheng Chiu, Barret Zoph, Ekin D. Cubuk, Quoc V. Le

We present SpecAugment, a simple data augmentation method for speech recognition. SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients)...
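
The abstract is truncated here, so the following is only a minimal sketch of what augmenting filter bank features directly can look like. It assumes the frequency-masking and time-masking operations described in the full paper and omits time warping. The function name, parameter names, and mask widths are hypothetical, and masked values are set to zero on the assumption that the features are mean-normalized.

```python
import random

def mask_spectrogram(features, max_freq_width=2, max_time_width=3, rng=None):
    """Return a copy of `features` with one frequency band and one time span zeroed.

    `features` is a list of time frames, each a list of filter bank values.
    All names and default widths here are illustrative assumptions.
    """
    rng = rng or random.Random()
    num_frames = len(features)
    num_bins = len(features[0])
    out = [list(frame) for frame in features]  # copy; the input is not modified

    # Frequency mask: zero f consecutive bins in every frame.
    f = rng.randint(0, min(max_freq_width, num_bins))
    f0 = rng.randint(0, num_bins - f)
    for frame in out:
        for b in range(f0, f0 + f):
            frame[b] = 0.0

    # Time mask: zero t consecutive frames across all bins.
    t = rng.randint(0, min(max_time_width, num_frames))
    t0 = rng.randint(0, num_frames - t)
    for i in range(t0, t0 + t):
        out[i] = [0.0] * num_bins

    return out
```

Because the masks are redrawn on every call, applying the function to each training example at load time yields a different corrupted view of the same utterance in each epoch, with no change to the audio pipeline or the model.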

Evaluation Results from the Paper


 SOTA for Speech Recognition on LibriSpeech test-clean (using extra training data)

Task                Dataset                 Model                   Metric name            Value  Global rank
Speech Recognition  LibriSpeech test-clean  LAS                     Word Error Rate (WER)  3.20   # 8
Speech Recognition  LibriSpeech test-clean  LAS + SpecAugment       Word Error Rate (WER)  2.50   # 3
Speech Recognition  LibriSpeech test-other  LAS + SpecAugment       Word Error Rate (WER)  5.80   # 3
Speech Recognition  Switchboard + Hub500    LAS + SpecAugment (SM)  Percentage error       6.8    # 6