Search Results for author: Deyi Tuo

Found 7 papers, 3 papers with code

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

no code implementations • 31 Aug 2023 • Shaohuan Zhou, Shun Lei, Weiya You, Deyi Tuo, Yuren You, Zhiyong Wu, Shiyin Kang, Helen Meng

This paper presents an end-to-end high-quality singing voice synthesis (SVS) system that uses semantic embeddings derived from Bidirectional Encoder Representations from Transformers (BERT) to improve the expressiveness of the synthesized singing voice.
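
A minimal sketch of the general idea, assuming a pretrained Chinese BERT checkpoint and a simple project-and-concatenate fusion with phoneme encoder states; the paper's actual fusion scheme and model names are not specified here, so treat these as placeholders.

```python
# Hypothetical sketch: deriving semantic embeddings from lyrics with a
# pretrained BERT and fusing them with phoneme encoder states.
# The checkpoint name and the concatenation-based fusion are assumptions.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")  # assumed checkpoint
bert = BertModel.from_pretrained("bert-base-chinese").eval()

lyrics = "你好世界"
with torch.no_grad():
    tokens = tokenizer(lyrics, return_tensors="pt")
    semantic = bert(**tokens).last_hidden_state        # (1, T_text, 768)

# Stand-in phoneme encoder states of shape (1, T_phone, d_model); one simple
# fusion is to project an utterance-level BERT summary and concatenate it.
phone_hidden = torch.randn(1, 20, 256)
proj = torch.nn.Linear(768, 256)
semantic_proj = proj(semantic.mean(dim=1, keepdim=True))   # utterance-level summary
fused = torch.cat([phone_hidden,
                   semantic_proj.expand(-1, phone_hidden.size(1), -1)], dim=-1)
print(fused.shape)  # torch.Size([1, 20, 512])
```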

Singing Voice Synthesis

Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

no code implementations • 31 Aug 2023 • Jie Chen, Changhe Song, Deyi Tuo, Xixin Wu, Shiyin Kang, Zhiyong Wu, Helen Meng

For text-to-speech (TTS) synthesis, prosodic structure prediction (PSP) plays an important role in producing natural and intelligible speech.

Multi-Task Learning

CoverHunter: Cover Song Identification with Refined Attention and Alignments

1 code implementation • 15 Jun 2023 • Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han

Cover song identification (CSI) focuses on finding different versions of the same music among reference anchors, given a query track.
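
The retrieval step of CSI can be illustrated with a simple embedding-similarity search; the sketch below uses random placeholder vectors in place of embeddings from a trained model such as CoverHunter's attention-based encoder, so dimensions and scoring are assumptions, not the paper's pipeline.

```python
# Minimal sketch: rank reference anchors by cosine similarity to a query
# embedding. Embeddings here are random stand-ins for model outputs.
import numpy as np

rng = np.random.default_rng(0)
reference = rng.normal(size=(1000, 128))   # 1000 anchor embeddings
query = rng.normal(size=(128,))            # one query-track embedding

ref_norm = reference / np.linalg.norm(reference, axis=1, keepdims=True)
q_norm = query / np.linalg.norm(query)
scores = ref_norm @ q_norm                 # cosine similarity per anchor

top5 = np.argsort(scores)[::-1][:5]        # best-matching reference anchors
print(top5, scores[top5])
```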

Cover Song Identification

FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement

2 code implementations • 23 Mar 2022 • Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng

The previously proposed FullSubNet has achieved outstanding performance in the Deep Noise Suppression (DNS) Challenge and attracted much attention.
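
The title points to channel attention applied to complex spectrogram inputs; a hedged squeeze-and-excitation style sketch is shown below. FullSubNet+'s actual attention module may differ, so this is an illustration of channel re-weighting, not the paper's exact design.

```python
# Hedged sketch: re-weight spectrogram feature channels (e.g. magnitude,
# real and imaginary maps stacked together) with learned channel attention.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style re-weighting of feature-map channels."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(),
            nn.Linear(hidden, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, freq, time)
        weights = self.fc(x.mean(dim=(2, 3)))            # squeeze over freq & time
        return x * weights.unsqueeze(-1).unsqueeze(-1)   # re-weight each channel

x = torch.randn(2, 3, 257, 100)   # e.g. magnitude, real and imaginary spectrograms
print(ChannelAttention(3)(x).shape)
```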

Speech Enhancement

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

no code implementations • 20 Jun 2020 • Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng

Recent approaches mainly have the following limitations: 1) most speaker-independent methods need handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method to support multilingual or mixlingual speech as input.

Talking Head Generation

DurIAN: Duration Informed Attention Network For Multimodal Synthesis

4 code implementations • 4 Sep 2019 • Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu

In this paper, we present a generic and robust multimodal synthesis system that produces highly natural speech and facial expression simultaneously.
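
The "duration informed" part of the title suggests a length-regulator style expansion, where phoneme-level states are repeated according to predicted durations before decoding. The sketch below shows only that generic idea; DurIAN's actual alignment mechanism may differ, and all names and shapes are assumptions.

```python
# Hedged sketch of duration-informed expansion (a "length regulator"):
# repeat each phoneme's encoder state for its predicted number of frames.
import torch

def length_regulate(phone_states: torch.Tensor, durations: torch.Tensor) -> torch.Tensor:
    """phone_states: (T_phone, d); durations: (T_phone,) integer frame counts."""
    return torch.repeat_interleave(phone_states, durations, dim=0)

phone_states = torch.randn(4, 8)         # 4 phonemes, 8-dim encoder states
durations = torch.tensor([3, 5, 2, 4])   # predicted frames per phoneme
frames = length_regulate(phone_states, durations)
print(frames.shape)                      # torch.Size([14, 8])
```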

Speech Synthesis
