no code implementations • 6 Sep 2023 • Yuankun Xie, Haonan Cheng, Yutian Wang, Long Ye
Specifically, our approach involves two novel parts: an embedding similarity module and a temporal convolution operation.
1 code implementation • 5 Sep 2023 • Yuankun Xie, Jingjing Zhou, Xiaolin Lu, Zhenghao Jiang, Yuxin Yang, Haonan Cheng, Long Ye
In this paper, we first construct a Chinese Fake Song Detection (FSD) dataset to investigate the field of song deepfake detection.
no code implementations • 7 Apr 2022 • Yutian Wang, Yuankun Xie, Kun Zhao, Hui Wang, Qin Zhang
In this paper, we propose a novel prosody disentanglement method for prosodic Text-to-Speech (TTS) models, which introduces vector quantization (VQ) into the auxiliary prosody encoder to obtain decomposed prosody representations in an unsupervised manner.