no code implementations • 20 Feb 2023 • Leyuan Qu, Cornelius Weber, Stefan Wermter
Furthermore, our proposed combined loss rescaling and weight consolidation methods can support continual learning of an ASR system.
Automatic Speech Recognition (ASR) +5
no code implementations • 14 Dec 2022 • Leyuan Qu, Taihao Li, Cornelius Weber, Theresa Pekarek-Rosin, Fuji Ren, Stefan Wermter
Human speech can be characterized by different components, including semantic content, speaker identity and prosodic information.
Automatic Speech Recognition (ASR) +5
no code implementations • 16 Nov 2022 • Leyuan Qu, Wei Wang, Cornelius Weber, Pengcheng Yue, Taihao Li, Stefan Wermter
Once training is completed, EmoAug enriches expressions of emotional speech with different prosodic attributes, such as stress, rhythm and intensity, by feeding different styles into the paralinguistic encoder.
Automatic Speech Recognition (ASR) +4
no code implementations • LREC 2022 • Gerald Schwiebert, Cornelius Weber, Leyuan Qu, Henrique Siqueira, Stefan Wermter
Large datasets as required for deep learning of lip reading do not exist in many languages.
no code implementations • 9 Dec 2021 • Leyuan Qu, Cornelius Weber, Stefan Wermter
The aim of this work is to investigate the impact of crossmodal self-supervised pre-training for speech reconstruction (video-to-audio) by leveraging the natural co-occurrence of audio and visual streams in videos.
no code implementations • 17 May 2020 • Leyuan Qu, Cornelius Weber, Stefan Wermter
Target speech separation refers to isolating target speech from a multi-speaker mixture signal by conditioning on auxiliary information about the target speaker.
Audio and Speech Processing • Sound