1 code implementation • 8 Mar 2024 • Peng Liu, Dongyang Dai
Recent advancements in generative modeling have led to significant progress in audio waveform reconstruction from diverse representations.
no code implementations • 7 Oct 2021 • Dongyang Dai, Yuanzhe Chen, Li Chen, Ming Tu, Lu Liu, Rui Xia, Qiao Tian, Yuping Wang, Yuxuan Wang
(2) How to clone a person's voice while controlling the style and prosody.
no code implementations • 21 Dec 2020 • Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng
By using deep learning approaches, Speech Emotion Recog-nition (SER) on a single domain has achieved many excellentresults.
no code implementations • 20 Jun 2020 • Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng
Recent approaches mainly have following limitations: 1) most speaker-independent methods need handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method to support multilingual or mixlingual speech as input.
no code implementations • 26 May 2020 • Dongyang Dai, Li Chen, Yu-Ping Wang, Mu Wang, Rui Xia, Xuchen Song, Zhiyong Wu, Yuxuan Wang
Firstly, the speech synthesis model is pre-trained with both multi-speaker clean data and noisy augmented data; then the pre-trained model is adapted on noisy low-resource new speaker data; finally, by setting the clean speech condition, the model can synthesize the new speaker's clean voice.