no code implementations • 31 Aug 2023 • Haohan Guo, Fenglong Xie, Jiawen Kang, Yujia Xiao, Xixin Wu, Helen Meng
This paper proposes a novel semi-supervised TTS framework, QS-TTS, to improve TTS quality with lower supervised data requirements via Vector-Quantized Self-Supervised Speech Representation Learning (VQ-S3RL), which leverages additional unlabeled speech audio.
1 code implementation • 27 Oct 2022 • Haohan Guo, Fenglong Xie, Xixin Wu, Hui Lu, Helen Meng
Moreover, we optimize the training strategy by leveraging more audio to better learn MSMCRs for low-resource languages.
1 code implementation • 22 Sep 2022 • Haohan Guo, Fenglong Xie, Frank K. Soong, Xixin Wu, Helen Meng
A vector-quantized variational autoencoder (VQ-VAE) based feature analyzer encodes Mel spectrograms of the training speech by progressively down-sampling them in multiple stages into MSMC Representations (MSMCRs) with different time resolutions, quantizing each stage with its own VQ codebook.
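The multi-stage encoding described above can be sketched as follows. This is a minimal numpy illustration, not the paper's learned VQ-VAE: the codebooks here are random rather than trained, and the stride values, codebook sizes, and the helper names `quantize` and `msmc_encode` are illustrative assumptions.

```python
import numpy as np

def quantize(frames, codebook):
    # Nearest-codeword assignment: each frame is replaced by its
    # closest codebook vector (squared Euclidean distance).
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

def msmc_encode(mel, codebooks, strides):
    """Encode a Mel spectrogram into multi-stage representations by
    progressively down-sampling in time and quantizing each stage
    with its own codebook (a sketch of the MSMCR idea)."""
    stages = []
    x = mel
    for codebook, stride in zip(codebooks, strides):
        # Down-sample in time: average non-overlapping windows of `stride` frames.
        T = (len(x) // stride) * stride
        x = x[:T].reshape(-1, stride, x.shape[1]).mean(axis=1)
        q, idx = quantize(x, codebook)
        stages.append((q, idx))
        x = q  # the next stage operates on the quantized output

    return stages

rng = np.random.default_rng(0)
mel = rng.normal(size=(64, 80))                  # 64 frames, 80 Mel bins
codebooks = [rng.normal(size=(16, 80)) for _ in range(2)]
stages = msmc_encode(mel, codebooks, strides=[2, 2])
print([q.shape for q, _ in stages])              # two time resolutions: 32 and 16 frames
```

Each stage halves the time resolution here, so the two stages hold coarse-to-fine views of the same utterance, indexed into their respective codebooks.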
no code implementations • 28 Sep 2021 • Shilun Lin, Wenchao Su, Li Meng, Fenglong Xie, Xinhui Li, Li Lu
Thirdly, a duration predictor, rather than an attention model, connects the above hybrid encoder and decoder.
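The duration-predictor connection can be illustrated with a length-regulation step: each encoder state is repeated for its predicted number of output frames, giving the decoder an explicitly aligned input instead of a learned attention alignment. This is a hedged sketch in the style of FastSpeech-like models, not the paper's exact module; the function name `length_regulate` and the toy dimensions are assumptions.

```python
import numpy as np

def length_regulate(encoder_states, durations):
    """Expand each encoder state by its predicted duration (in frames),
    replacing attention-based alignment with explicit repetition."""
    return np.repeat(encoder_states, durations, axis=0)

states = np.arange(6, dtype=float).reshape(3, 2)   # 3 encoder states, dim 2
durations = np.array([2, 1, 3])                    # predicted frames per state
frames = length_regulate(states, durations)
print(frames.shape)                                # (6, 2): length = sum of durations
```

Because the decoder length is fixed by the predicted durations, this avoids the skipping and repetition failures that attention-based alignment can exhibit in large-scale deployment.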
no code implementations • 30 Jan 2021 • Shilun Lin, Fenglong Xie, Li Meng, Xinhui Li, Li Lu
In this work, a robust and efficient text-to-speech (TTS) synthesis system named Triple M is proposed for large-scale online application.