no code implementations • 5 Jun 2023 • Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang, Binghuai Lin
Regressive Text-to-Speech (TTS) system utilizes attention mechanism to generate alignment between text and acoustic feature sequence.
no code implementations • 3 May 2023 • Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, JianHua Tao, Jianqing Sun, Jiaen Liang
However, it is still a challenge to comprehensively model the conversation, and a majority of conversational TTS systems only focus on extracting global information and omit local prosody features, which contain important fine-grained information like keywords and emphasis.
1 code implementation • 20 Mar 2022 • Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun, Jiaen Liang
In recent years, neural network based methods for multi-speaker text-to-speech synthesis (TTS) have made significant progress.
no code implementations • 4 Mar 2022 • Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang
In recent years, exploring effective sound separation (SSep) techniques to improve overlapping sound event detection (SED) attracts more and more attention.
no code implementations • 26 Mar 2021 • Tiantian Tang, Xinyuan Zhou, Yanhua Long, Yijie Li, Jiaen Liang
Domain mismatch is a noteworthy issue in acoustic event detection tasks, as the target domain data is difficult to access in most real applications.
no code implementations • 23 Mar 2021 • Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang, Yuping Wang
A good joint training framework is very helpful to improve the performances of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously.
no code implementations • 19 Oct 2020 • Jiangyu Han, Wei Rao, Yanhua Long, Jiaen Liang
Furthermore, by introducing a mixture embedding matrix pooling method, our proposed attention-based scaling adaptation (ASA) can exploit the target speaker clues in a more efficient way.