Search Results for author: Hailun Lian

Found 5 papers, 0 papers with code

PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion

no code implementations • 3 Mar 2024 • Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian

In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human perception.

Voice Conversion

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition

no code implementations • 19 Jan 2024 • Yong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn Schuller, Yuan Zong, Wenming Zheng

These segment-level patches are then encoded by a stack of Swin blocks, in which a local window Transformer explores local inter-frame emotional information across the frame patches within each segment patch.

Speech Emotion Recognition
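
The local-window mechanism described in the abstract can be illustrated with a short sketch. Below is a minimal, hypothetical PyTorch implementation of window-restricted self-attention over a sequence of frame-patch embeddings; the window size, embedding dimension, and head count are illustrative assumptions, not the paper's configuration, and the shifted-window step of Swin blocks is omitted for brevity.

```python
# Sketch only: window-restricted self-attention over frame patches.
# All hyperparameters are assumptions, not taken from the paper.
import torch
import torch.nn as nn

class LocalWindowAttention(nn.Module):
    """Self-attention restricted to fixed-size windows of frame patches."""
    def __init__(self, dim=64, window=8, heads=4):
        super().__init__()
        self.window = window
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, frames, dim)
        b, t, d = x.shape
        pad = (-t) % self.window               # pad so frames split evenly
        x = nn.functional.pad(x, (0, 0, 0, pad))
        w = x.view(-1, self.window, d)         # regroup frames into local windows
        h = self.norm(w)
        out, _ = self.attn(h, h, h)            # attention stays inside each window
        w = w + out                            # residual connection
        return w.reshape(b, t + pad, d)[:, :t]

# Example: 150 frame patches with 64-dim embeddings
feats = torch.randn(2, 150, 64)
print(LocalWindowAttention()(feats).shape)     # torch.Size([2, 150, 64])
```

Restricting attention to fixed windows keeps the cost linear in the number of frames while still modeling inter-frame correlations within each segment.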

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

no code implementations • 18 Jan 2024 • Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng

In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers.

Domain Adaptation • Speech Emotion Recognition
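
As a companion to the abstract above, here is a hedged sketch of the general joint-distribution-adaptation recipe it builds on: align the marginal feature distributions across speaker domains and, using pseudo-labels on the target side, the class-conditional ones as well. The linear-kernel MMD and the equal weighting of the two terms are assumptions; the paper's dynamic balancing is not reproduced here.

```python
# Sketch only: joint (marginal + conditional) distribution alignment.
# The specific losses and weighting are assumptions, not the paper's method.
import torch

def mmd_linear(a, b):
    """Squared distance between feature means (linear-kernel MMD)."""
    return (a.mean(0) - b.mean(0)).pow(2).sum()

def joint_adaptation_loss(src_f, src_y, tgt_f, tgt_y_pseudo, n_classes):
    loss = mmd_linear(src_f, tgt_f)                    # marginal alignment
    for c in range(n_classes):                         # conditional alignment
        s = src_f[src_y == c]
        t = tgt_f[tgt_y_pseudo == c]
        if len(s) > 0 and len(t) > 0:                  # skip classes with no samples
            loss = loss + mmd_linear(s, t)
    return loss

# Example with random features from a "source" and a "target" speaker
src_f, tgt_f = torch.randn(32, 64), torch.randn(32, 64)
src_y = torch.randint(0, 4, (32,))
tgt_y_pseudo = torch.randint(0, 4, (32,))              # e.g., current model predictions
print(joint_adaptation_loss(src_f, src_y, tgt_f, tgt_y_pseudo, n_classes=4))
```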

Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

no code implementations • 17 Feb 2023 • Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao

In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled testing (target) speech signals come from different corpora.

Cross-corpus Speech Emotion Recognition +1
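
The labeled-source / unlabeled-target setup described in the abstract can be sketched as a two-term objective: a supervised loss on the source corpus plus an alignment regularizer tying the two feature distributions together. The sketch below uses CORAL (covariance matching) as a stand-in alignment term; DIDAN's implicit alignment loss, and all dimensions, are not taken from the paper.

```python
# Sketch only: cross-corpus training loop with a stand-in alignment term.
# CORAL replaces DIDAN's implicit alignment loss; dims are assumptions.
import torch
import torch.nn as nn

def coral(fs, ft):
    """CORAL: match second-order statistics of source and target features."""
    d = fs.size(1)
    return (torch.cov(fs.T) - torch.cov(ft.T)).pow(2).sum() / (4 * d * d)

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU())  # illustrative feature extractor
classifier = nn.Linear(64, 4)                          # 4 emotion classes (assumed)
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()))

def train_step(xs, ys, xt, lam=0.5):
    """One step: supervised source loss + unsupervised alignment to target."""
    fs, ft = encoder(xs), encoder(xt)
    loss = nn.functional.cross_entropy(classifier(fs), ys) + lam * coral(fs, ft)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

print(train_step(torch.randn(32, 40), torch.randint(0, 4, (32,)), torch.randn(32, 40)))
```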

Speech Emotion Recognition via an Attentive Time-Frequency Neural Network

no code implementations • 22 Oct 2022 • Cheng Lu, Wenming Zheng, Hailun Lian, Yuan Zong, Chuangao Tang, Sunan Li, Yan Zhao

The F-Encoder and T-Encoder model the correlations within frequency bands and time frames, respectively, and they are embedded into a time-frequency joint learning strategy to obtain the time-frequency patterns for speech emotions.

Speech Emotion Recognition
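
To make the F-Encoder/T-Encoder factorization in the abstract concrete, here is a minimal sketch that encodes a spectrogram twice, once with frequency bands as tokens and once with time frames as tokens, then fuses the two views for classification. The Transformer layers, mean pooling, and dimensions are illustrative assumptions, not the paper's attentive architecture.

```python
# Sketch only: separate frequency-axis and time-axis encoders, then fusion.
# Layer choices and dims are assumptions, not the paper's design.
import torch
import torch.nn as nn

class TimeFreqNet(nn.Module):
    """Encode a spectrogram along frequency and time separately, then fuse."""
    def __init__(self, n_freq=40, n_time=100, dim=64, n_classes=4):
        super().__init__()
        self.f_proj = nn.Linear(n_time, dim)   # each frequency band becomes a token
        self.t_proj = nn.Linear(n_freq, dim)   # each time frame becomes a token
        self.f_enc = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.t_enc = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.head = nn.Linear(2 * dim, n_classes)

    def forward(self, spec):                   # spec: (batch, freq, time)
        f = self.f_enc(self.f_proj(spec)).mean(dim=1)                  # band correlations
        t = self.t_enc(self.t_proj(spec.transpose(1, 2))).mean(dim=1)  # frame correlations
        return self.head(torch.cat([f, t], dim=-1))                    # joint decision

print(TimeFreqNet()(torch.randn(2, 40, 100)).shape)  # torch.Size([2, 4])
```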
