Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion

6 Apr 2019 | Hao Sun, Xu Tan, Jun-Wei Gan, Hongzhi Liu, Sheng Zhao, Tao Qin, Tie-Yan Liu

Grapheme-to-phoneme (G2P) conversion is an important task in automatic speech recognition and text-to-speech systems. Recently, G2P conversion has been viewed as a sequence-to-sequence task and modeled with RNN- or CNN-based encoder-decoder frameworks...


Evaluation Results from the Paper


TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK
Text-To-Speech Synthesis | CMUDict 0.7b | Token-Level Ensemble Distillation | Word Error Rate (WER) | 19.88% | # 1
Text-To-Speech Synthesis | CMUDict 0.7b | Token-Level Ensemble Distillation | Phoneme Error Rate | 4.6% | # 1
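
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how WER and phoneme error rate are commonly computed for G2P, assuming one reference pronunciation per word. The function names (`edit_distance`, `g2p_error_rates`) and the toy data are hypothetical, and the paper's exact evaluation protocol (for example, how words with several valid pronunciations are scored) is not stated in this excerpt and may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def g2p_error_rates(references, hypotheses):
    """Return (WER, PER) as fractions.

    Assumption: WER is the share of words whose predicted phoneme
    sequence is not an exact match; PER is the total edit distance
    divided by the total number of reference phonemes.
    """
    if not references or len(references) != len(hypotheses):
        raise ValueError("need equally sized, non-empty lists")
    word_errors = sum(1 for r, h in zip(references, hypotheses) if r != h)
    phoneme_errors = sum(edit_distance(r, h)
                         for r, h in zip(references, hypotheses))
    total_phonemes = sum(len(r) for r in references)
    return word_errors / len(references), phoneme_errors / total_phonemes


# Toy data (illustrative only, not from the paper).
refs = [["HH", "AH", "L", "OW"], ["K", "AE", "T"]]
hyps = [["HH", "AH", "L", "OW"], ["K", "AH", "T"]]
wer, per = g2p_error_rates(refs, hyps)
print(f"WER = {wer:.2%}, PER = {per:.2%}")
```

Under these definitions a single wrong phoneme makes the whole word count as an error, which is why WER is typically several times larger than the phoneme error rate on the same predictions.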