no code implementations • 28 Oct 2022 • Jason Fong, Yun Wang, Prabhav Agrawal, Vimal Manohar, JiLong Wu, Thilo Köhler, Qing He
Text-based voice editing (TBVE) uses synthetic output from text-to-speech (TTS) systems to replace words in an original recording.
1 code implementation • 4 May 2021 • Jennifer Williams, Jason Fong, Erica Cooper, Junichi Yamagishi
This work examines the content and usefulness of disentangled phone and speaker representations from two separately trained VQ-VAE systems: one trained on multilingual data and another trained on monolingual data.