Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis

NeurIPS 2018. Ye Jia, Yu Zhang, Ron J. Weiss, Quan Wang, Jonathan Shen, Fei Ren, Zhifeng Chen, Patrick Nguyen, Ruoming Pang, Ignacio Lopez Moreno, Yonghui Wu

Clone a voice in 5 seconds to generate arbitrary speech in real time.

Evaluation results from the paper

 SOTA for Text-To-Speech Synthesis on LJSpeech (using extra training data)

Task | Dataset | Model | Metric name | Metric value | Global rank | Uses extra training data
Text-To-Speech Synthesis | LJSpeech | tacotron | Accuracy | 12 | #1 |