Search Results for author: Xinfa Zhu

Found 6 papers, 0 papers with code

Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation

no code implementations • 11 Jun 2024 • Hanzhao Li, Liumeng Xue, Haohan Guo, Xinfa Zhu, YuanJun Lv, Lei Xie, Yunlin Chen, Hao Yin, Zhifei Li

The multi-codebook speech codec enables the application of large language models (LLMs) in TTS but bottlenecks efficiency and robustness due to multi-sequence prediction.

SELM: Speech Enhancement Using Discrete Tokens and Language Models

no code implementations • 15 Dec 2023 • Ziqian Wang, Xinfa Zhu, Zihan Zhang, YuanJun Lv, Ning Jiang, Guoqing Zhao, Lei Xie

Given the intrinsic similarity between speech generation and speech enhancement, harnessing semantic information holds potential advantages for speech enhancement tasks.

Self-Supervised Learning, Speech Enhancement

Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning

no code implementations • 26 Oct 2023 • Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie

This paper aims to build a multi-speaker expressive TTS system, synthesizing a target speaker's speech with multiple styles and emotions.

Contrastive Learning, Expressive Speech Synthesis

METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer

no code implementations • 29 Jul 2023 • Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie

However, such data-efficient approaches have ignored synthesizing emotional aspects of speech due to the challenges of cross-speaker cross-lingual emotion transfer: the heavy entanglement of speaker timbre, emotion, and language factors in the speech signal makes a system produce cross-lingual synthetic speech with an undesired foreign accent and weak emotional expressiveness.

Disentanglement, Diversity, +3 more

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling

no code implementations • 19 Nov 2022 • Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie

This paper aims to synthesize the target speaker's speech with desired speaking style and emotion by transferring the style and emotion from reference speech recorded by other speakers.

Expressive Speech Synthesis
