English Conversational Telephone Speech Recognition by Humans and Machines

6 Mar 2017 · George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall

One of the most difficult speech recognition tasks is accurate recognition of human to human communication. Advances in deep learning over the last few years have produced major speech recognition improvements on the representative Switchboard conversational corpus...


Evaluation results from the paper

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Speech Recognition | swb_hub_500 WER fullSWBCH | ResNet + BiLSTMs acoustic model | Percentage error | 10.3 | #1 |
| Speech Recognition | Switchboard + Hub500 | ResNet + BiLSTMs acoustic model | Percentage error | 5.5 | #1 |