Search Results for author: Chenye Cui

Found 7 papers, 3 papers with code

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

no code implementations • 18 May 2023 • Jinzheng He, Jinglin Liu, Zhenhui Ye, Rongjie Huang, Chenye Cui, Huadai Liu, Zhou Zhao

To tackle these challenges, we propose RMSSinger, the first RMS-SVS (realistic-music-score based singing voice synthesis) method, which takes realistic music scores as input, eliminating most of the tedious manual annotation and avoiding the aforementioned inconvenience.

Singing Voice Synthesis

VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement

no code implementations • 19 Nov 2022 • Chenye Cui, Yi Ren, Jinglin Liu, Rongjie Huang, Zhou Zhao

In this paper, we pose the task of generating sound with a specific timbre given a video input and a reference audio sample.

Disentanglement

ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech

3 code implementations • 13 Jul 2022 • Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren

Through a preliminary study on diffusion model parameterization, we find that previous gradient-based TTS models require hundreds or thousands of iterations to guarantee high sample quality, which poses a challenge for accelerating sampling.

Denoising, Knowledge Distillation, +3
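The ProDiff abstract above refers to the cost of iterative diffusion sampling: a gradient-based (DDPM-style) sampler evaluates the denoising network once per step, so hundreds or thousands of steps mean the same number of sequential network calls per utterance. The sketch below is a generic ancestral-sampling loop that illustrates this bottleneck only; it is not ProDiff's implementation, and the stand-in eps_model, the linear beta schedule, and the mel-spectrogram-like shape are hypothetical choices for illustration.

```python
# Minimal sketch (assumption: generic DDPM-style ancestral sampling, not ProDiff's code).
# The denoising network is called once per step, so T steps = T sequential forward passes.
import numpy as np

def eps_model(x_t, t):
    """Stand-in for a trained noise-prediction network (hypothetical placeholder)."""
    return 0.1 * x_t

def ddpm_sample(shape, T=1000, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)            # start from pure Gaussian noise
    for t in reversed(range(T)):              # T sequential denoising steps
        eps = eps_model(x, t)                 # one network evaluation per step
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Example: a mel-spectrogram-shaped sample (80 mel bins x 200 frames), 1000 steps.
sample = ddpm_sample((80, 200), T=1000)
```

Reducing T, as the paper's "Progressive" framing and its Knowledge Distillation tag suggest, is what shortens this loop; speeding up an individual step does not change the number of sequential calls.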

GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech

2 code implementations • 15 May 2022 • Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao

Style transfer for out-of-domain (OOD) speech synthesis aims to generate speech samples with unseen style (e.g., speaker identity, emotion, and prosody) derived from an acoustic reference, while facing the following challenges: 1) the highly dynamic style features in expressive voice are difficult to model and transfer; and 2) the TTS models should be robust enough to handle diverse OOD conditions that differ from the source data.

Speech Synthesis, Style Transfer, +1

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

2 code implementations • MM '21: Proceedings of the 29th ACM International Conference on Multimedia 2021 • Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao

High-fidelity multi-singer singing voice synthesis is challenging for neural vocoders due to the shortage of singing voice data, limited singer generalization, and large computational cost.

Audio Generation, Singing Voice Synthesis, +1

EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

no code implementations • 17 Jun 2021 • Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao

Finally, by achieving comparable performance on the emotional speech synthesis task, we demonstrate the capability of the proposed model.

Emotional Speech Synthesis, Emotion Classification
