Search Results for author: Dongyang Dai

Found 8 papers, 1 paper with code

Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition

no code implementations • 2 Jan 2025 • Dongyang Dai, Zhiyong Wu, Runnan Li, Xixin Wu, Jia Jia, Helen Meng

Identifying the emotional state from speech is essential for the natural interaction of the machine with the speaker.

Speech Emotion Recognition

Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT

no code implementations • 2 Jan 2025 • Dongyang Dai, Zhiyong Wu, Shiyin Kang, Xixin Wu, Jia Jia, Dan Su, Dong Yu, Helen Meng

The pre-trained BERT model extracts semantic features from a raw Chinese character sequence, and the NN-based classifier predicts the polyphonic character's pronunciation according to the BERT output.

Polyphone disambiguation Sentence +1

Multi-modal Adversarial Training for Zero-Shot Voice Cloning

no code implementations • 28 Aug 2024 • John Janiczek, Dading Chong, Dongyang Dai, Arlo Faria, Chao Wang, Tao Wang, Yuzong Liu

The discriminator is used in a training pipeline that improves both the acoustic and prosodic features of a TTS model.

Decoder Text to Speech +1

RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

1 code implementation • 8 Mar 2024 • Peng Liu, Dongyang Dai, Zhiyong Wu

Recent advancements in generative modeling have significantly enhanced the reconstruction of audio waveforms from various representations.

Audio Generation Computational Efficiency +1

Unsupervised Cross-Lingual Speech Emotion Recognition Using Domain Adversarial Neural Network

no code implementations • 21 Dec 2020 • Xiong Cai, Zhiyong Wu, Kuo Zhong, Bin Su, Dongyang Dai, Helen Meng

By using deep learning approaches, Speech Emotion Recognition (SER) on a single domain has achieved many excellent results.

Speech Emotion Recognition

Speaker Independent and Multilingual/Mixlingual Speech-Driven Talking Head Generation Using Phonetic Posteriorgrams

no code implementations • 20 Jun 2020 • Huirong Huang, Zhiyong Wu, Shiyin Kang, Dongyang Dai, Jia Jia, Tianxiao Fu, Deyi Tuo, Guangzhi Lei, Peng Liu, Dan Su, Dong Yu, Helen Meng

Recent approaches mainly have the following limitations: 1) most speaker-independent methods need handcrafted features that are time-consuming to design or unreliable; 2) there is no convincing method to support multilingual or mixlingual speech as input.

Talking Head Generation

Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement

no code implementations • 26 May 2020 • Dongyang Dai, Li Chen, Yu-Ping Wang, Mu Wang, Rui Xia, Xuchen Song, Zhiyong Wu, Yuxuan Wang

First, the speech synthesis model is pre-trained with both multi-speaker clean data and noisy augmented data; then the pre-trained model is adapted on noisy low-resource new speaker data; finally, by setting the clean speech condition, the model can synthesize the new speaker's clean voice.

Decoder Speech Enhancement +1
