Search Results for author: Chenye Cui

Found 7 papers, 3 papers with code

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

no code implementations • 18 May 2023 • Jinzheng He, Jinglin Liu, Zhenhui Ye, Rongjie Huang, Chenye Cui, Huadai Liu, Zhou Zhao

To tackle these challenges, we propose RMSSinger, the first RMS-SVS (realistic-music-score based singing voice synthesis) method, which takes realistic music scores as input, eliminating most of the tedious manual annotation and avoiding the aforementioned inconvenience.

Singing Voice Synthesis

VarietySound: Timbre-Controllable Video to Sound Generation via Unsupervised Information Disentanglement

no code implementations • 19 Nov 2022 • Chenye Cui, Yi Ren, Jinglin Liu, Rongjie Huang, Zhou Zhao

In this paper, we pose the task of generating sound with a specific timbre given a video input and a reference audio sample.

Disentanglement

ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech

3 code implementations • 13 Jul 2022 • Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren

Through a preliminary study on diffusion model parameterization, we find that previous gradient-based TTS models require hundreds or thousands of iterations to guarantee high sample quality, which poses a challenge for accelerating sampling.

Denoising, Knowledge Distillation, +3
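The ProDiff abstract above refers to the cost of iterative diffusion sampling: a gradient-based (DDPM-style) sampler evaluates the denoising network once per step, so hundreds or thousands of steps mean the same number of sequential network calls per utterance. The sketch below is a generic ancestral-sampling loop that illustrates this bottleneck only; it is not ProDiff's implementation, and the stand-in eps_model, the linear beta schedule, and the mel-spectrogram-like shape are hypothetical choices for illustration.

```python
# Minimal sketch (assumption: generic DDPM-style ancestral sampling, not ProDiff's code).
# The denoising network is called once per step, so T steps = T sequential forward passes.
import numpy as np

def eps_model(x_t, t):
    """Stand-in for a trained noise-prediction network (hypothetical placeholder)."""
    return 0.1 * x_t

def ddpm_sample(shape, T=1000, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (assumed)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)            # start from pure Gaussian noise
    for t in reversed(range(T)):              # T sequential denoising steps
        eps = eps_model(x, t)                 # one network evaluation per step
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Example: a mel-spectrogram-shaped sample (80 mel bins x 200 frames), 1000 steps.
sample = ddpm_sample((80, 200), T=1000)
```

Reducing T, as the paper's "Progressive" framing and its Knowledge Distillation tag suggest, is what shortens this loop; speeding up an individual step does not change the number of sequential calls.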

GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech

2 code implementations • 15 May 2022 • Rongjie Huang, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao

Style transfer for out-of-domain (OOD) speech synthesis aims to generate speech samples with unseen style (e.g., speaker identity, emotion, and prosody) derived from an acoustic reference, while facing the following challenges: 1) the highly dynamic style features in expressive voice are difficult to model and transfer; and 2) the TTS models should be robust enough to handle diverse OOD conditions that differ from the source data.

Speech Synthesis, Style Transfer, +1

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

2 code implementations • MM '21: Proceedings of the 29th ACM International Conference on Multimedia 2021 • Rongjie Huang, Feiyang Chen, Yi Ren, Jinglin Liu, Chenye Cui, Zhou Zhao

High-fidelity multi-singer singing voice synthesis is challenging for neural vocoders due to the shortage of singing voice data, limited singer generalization, and large computational cost.

Audio Generation, Singing Voice Synthesis, +1

EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

no code implementations • 17 Jun 2021 • Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao

Finally, by achieving comparable performance on the emotional speech synthesis task, we demonstrate the capability of the proposed model.

Emotional Speech Synthesis, Emotion Classification
