no code implementations • 27 Jun 2023 • Yahuan Cong, Haoyu Zhang, Haopeng Lin, Shichao Liu, Chunfeng Wang, Yi Ren, Xiang Yin, Zejun Ma
Cross-lingual timbre and style generalizable text-to-speech (TTS) aims to synthesize speech with a specific reference timbre or style that is never trained in the target language.
no code implementations • 13 Feb 2020 • Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang
Speaker diarization, which is to find the speech segments of specific speakers, has been widely used in human-centered applications such as video conferences or human-computer interaction systems.