Search Results for author: Guangyan Zhang

Found 10 papers, 1 paper with code

Creating Personalized Synthetic Voices from Post-Glossectomy Speech with Guided Diffusion Models

no code implementations • 27 May 2023 • Yusheng Tian, Guangyan Zhang, Tan Lee

Specifically, a diffusion-based speech synthesis model is trained on original recordings, to capture and preserve the target speaker's original articulation style.

Speech Synthesis, Voice Conversion

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech

no code implementations • 31 Mar 2022 • Guangyan Zhang, Kaitao Song, Xu Tan, Daxin Tan, Yuzi Yan, Yanqing Liu, Gang Wang, Wei Zhou, Tao Qin, Tan Lee, Sheng Zhao

However, the works apply pre-training with character-based units to enhance the TTS phoneme encoder, which is inconsistent with the TTS fine-tuning that takes phonemes as input.

A study on the efficacy of model pre-training in developing neural text-to-speech system

no code implementations • 8 Oct 2021 • Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee

However, in terms of ultimately achieved system performance for target speaker(s), the actual benefits of model pre-training are uncertain and unstable, depending very much on the quantity and text content of training data.

Computational Efficiency

Environment Aware Text-to-Speech Synthesis

no code implementations • 8 Oct 2021 • Daxin Tan, Guangyan Zhang, Tan Lee

The key idea is to model the acoustic environment in speech audio as a factor of data variability and incorporate it as a condition in the process of neural network based speech synthesis.

Attribute Disentanglement +2

Applying the Information Bottleneck Principle to Prosodic Representation Learning

no code implementations • 5 Aug 2021 • Guangyan Zhang, Ying Qin, Daxin Tan, Tan Lee

This paper describes a novel design of a neural network-based speech generation model for learning prosodic representation. The problem of representation learning is formulated according to the information bottleneck (IB) principle.

Representation Learning

AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style

no code implementations • 6 Jul 2021 • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu

While recent text to speech (TTS) models perform very well in synthesizing reading-style (e.g., audiobook) speech, it is still challenging to synthesize spontaneous-style speech (e.g., podcast or conversation), mainly because of two reasons: 1) the lack of training data for spontaneous speech; 2) the difficulty in modeling the filled pauses (um and uh) and diverse rhythms in spontaneous speech.

CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge

no code implementations • 8 Mar 2021 • Daxin Tan, Hingpang Huang, Guangyan Zhang, Tan Lee

100 and 5 utterances of 3 target speakers in different voice and style are provided in track 1 and 2 respectively, and the participants are required to synthesize speech in target speaker's voice and style.

Voice Cloning

ItLnc-BXE: a Bagging-XGBoost-ensemble method with multiple features for identification of plant lncRNAs

1 code implementation • 1 Nov 2019 • Guangyan Zhang, Ziru Liu, Jichen Dai, Zilan Yu, Shuai Liu, Wen Zhang

However, most of the existing methods are designed for lncRNAs in animal systems, and only a few methods focus on the plant lncRNA identification.

Ensemble Learning, Feature Selection
