Search Results for author: Leyuan Qu

Found 6 papers, 0 papers with code

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

no code implementations • 20 Feb 2023 • Leyuan Qu, Cornelius Weber, Stefan Wermter

Furthermore, our proposed combined loss rescaling and weight consolidation methods can support continual learning of an ASR system.

Automatic Speech Recognition (ASR) +5

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

no code implementations • 14 Dec 2022 • Leyuan Qu, Taihao Li, Cornelius Weber, Theresa Pekarek-Rosin, Fuji Ren, Stefan Wermter

Human speech can be characterized by different components, including semantic content, speaker identity and prosodic information.

Automatic Speech Recognition (ASR) +5

Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer

no code implementations • 16 Nov 2022 • Leyuan Qu, Wei Wang, Cornelius Weber, Pengcheng Yue, Taihao Li, Stefan Wermter

Once training is completed, EmoAug enriches expressions of emotional speech with different prosodic attributes, such as stress, rhythm and intensity, by feeding different styles into the paralinguistic encoder.

Automatic Speech Recognition (ASR) +4

A Multimodal German Dataset for Automatic Lip Reading Systems and Transfer Learning

no code implementations • LREC 2022 • Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira, Stefan Wermter

Large datasets as required for deep learning of lip reading do not exist in many languages.

Lip Reading, Transfer Learning

LipSound2: Self-Supervised Pre-Training for Lip-to-Speech Reconstruction and Lip Reading

no code implementations • 9 Dec 2021 • Leyuan Qu, Cornelius Weber, Stefan Wermter

The aim of this work is to investigate the impact of crossmodal self-supervised pre-training for speech reconstruction (video-to-audio) by leveraging the natural co-occurrence of audio and visual streams in videos.

Lip Reading, speech-recognition +1

Multimodal Target Speech Separation with Voice and Face References

no code implementations • 17 May 2020 • Leyuan Qu, Cornelius Weber, Stefan Wermter

Target speech separation refers to isolating target speech from a multi-speaker mixture signal by conditioning on auxiliary information about the target speaker.

Audio and Speech Processing, Sound
