no code implementations • 18 Sep 2023 • Zheng-Yan Sheng, Yang Ai, Yan-Nian Chen, Zhen-Hua Ling
This paper presents a novel task, zero-shot voice conversion based on face images (zero-shot FaceVC), which aims to convert the voice characteristics of an utterance from any source speaker to those of a previously unseen target speaker, relying solely on a single face image of the target speaker.
no code implementations • 3 Sep 2020 • Jing-Xuan Zhang, Li-Juan Liu, Yan-Nian Chen, Ya-Jun Hu, Yuan Jiang, Zhen-Hua Ling, Li-Rong Dai
In this paper, we present an ASR-TTS method for voice conversion, which uses the iFLYTEK ASR engine to transcribe the source speech into text and a Transformer TTS model with a WaveNet vocoder to synthesize the converted speech from the decoded text.
Automatic Speech Recognition (ASR)