Search Results for author: Deyi Tuo

Found 7 papers, 3 papers with code

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

no code implementations • 31 Aug 2023 • Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng

This paper presents an end-to-end high-quality singing voice synthesis (SVS) system that uses semantic embeddings derived from Bidirectional Encoder Representations from Transformers (BERT) to improve the expressiveness of the synthesized singing voice.
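
A minimal sketch of the general idea, assuming a pretrained Chinese BERT checkpoint and a simple project-and-concatenate fusion with phoneme encoder states; the paper's actual fusion scheme and model names are not specified here, so treat these as placeholders.

```python
# Hypothetical sketch: deriving semantic embeddings from lyrics with a
# pretrained BERT and fusing them with phoneme encoder states.
# The checkpoint name and the concatenation-based fusion are assumptions.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese").eval()

lyrics = "你好世界"
with torch.no_grad():
    tokens = tokenizer(lyrics, return_tensors="pt")
    semantic = bert(**tokens).last_hidden_state        # (1, T_text, 768)

# Stand-in phoneme encoder states of shape (1, T_phone, d_model); one simple
# fusion is to project an utterance-level BERT summary and concatenate it.
phone_hidden = torch.randn(1, 20, 256)
proj = torch.nn.Linear(768, 256)
semantic_proj = proj(semantic.mean(dim=1, keepdim=True))   # utterance-level summary
fused = torch.cat([phone_hidden,
                   semantic_proj.expand(-1, phone_hidden.size(1), -1)], dim=-1)
print(fused.shape)  # torch.Size([1, 20, 512])
```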

Singing Voice Synthesis

Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

no code implementations • 31 Aug 2023 • Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng

For text-to-speech (TTS) synthesis, prosodic structure prediction (PSP) plays an important role in producing natural and intelligible speech.

Multi-Task Learning

CoverHunter: Cover Song Identification with Refined Attention and Alignments

1 code implementation • 15 Jun 2023 • Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han

Cover song identification (CSI) focuses on finding different versions of the same music among reference anchors, given a query track.
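
The retrieval step of CSI can be illustrated with a simple embedding-similarity search; the sketch below uses random placeholder vectors in place of embeddings from a trained model such as CoverHunter's attention-based encoder, so dimensions and scoring are assumptions, not the paper's pipeline.

```python
# Minimal sketch: rank reference anchors by cosine similarity to a query
# embedding. Embeddings here are random stand-ins for model outputs.
import numpy as np

rng = np.random.default_rng(0)
reference = rng.normal(size=(1000, 128))   # 1000 anchor embeddings
query = rng.normal(size=(128,))            # one query-track embedding

ref_norm = reference / np.linalg.norm(reference, axis=1, keepdims=True)
q_norm = query / np.linalg.norm(query)
scores = ref_norm @ q_norm                 # cosine similarity per anchor

top5 = np.argsort(scores)[::-1][:5]        # best-matching reference anchors
print(top5, scores[top5])
```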

Cover Song Identification

FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement

2 code implementations • 23 Mar 2022 • Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng

The previously proposed FullSubNet has achieved outstanding performance in the Deep Noise Suppression (DNS) Challenge and attracted much attention.
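
The title points to channel attention applied to complex spectrogram inputs; a hedged squeeze-and-excitation style sketch is shown below. FullSubNet+'s actual attention module may differ, so this is an illustration of channel re-weighting, not the paper's exact design.

```python
# Hedged sketch: re-weight spectrogram feature channels (e.g. magnitude,
# real and imaginary maps stacked together) with learned channel attention.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style re-weighting of feature-map channels."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(),
            nn.Linear(hidden, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, freq, time)
        weights = self.fc(x.mean(dim=(2, 3)))            # squeeze over freq & time
        return x * weights.unsqueeze(-1).unsqueeze(-1)   # re-weight each channel

x = torch.randn(2, 3, 257, 100)   # e.g. magnitude, real and imaginary spectrograms
print(ChannelAttention(3)(x).shape)
```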

Speech Enhancement

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

no code implementations • 20 Jun 2020 • Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng

Recent approaches mainly have the following limitations: 1) most speaker-independent methods need handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method to support multilingual or mixlingual speech as input.

Talking Head Generation

DurIAN: Duration Informed Attention Network For Multimodal Synthesis

4 code implementations • 4 Sep 2019 • Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu

In this paper, we present a generic and robust multimodal synthesis system that produces highly natural speech and facial expression simultaneously.
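
The "duration informed" part of the title suggests a length-regulator style expansion, where phoneme-level states are repeated according to predicted durations before decoding. The sketch below shows only that generic idea; DurIAN's actual alignment mechanism may differ, and all names and shapes are assumptions.

```python
# Hedged sketch of duration-informed expansion (a "length regulator"):
# repeat each phoneme's encoder state for its predicted number of frames.
import torch

def length_regulate(phone_states: torch.Tensor, durations: torch.Tensor) -> torch.Tensor:
    """phone_states: (T_phone, d); durations: (T_phone,) integer frame counts."""
    return torch.repeat_interleave(phone_states, durations, dim=0)

phone_states = torch.randn(4, 8)         # 4 phonemes, 8-dim encoder states
durations = torch.tensor([3, 5, 2, 4])   # predicted frames per phoneme
frames = length_regulate(phone_states, durations)
print(frames.shape)                      # torch.Size([14, 8])
```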

Speech Synthesis
