no code implementations • 20 Feb 2024 • Yanan Chen, Zihao Cui, Yingying Gao, Junlan Feng, Chao Deng, Shilei Zhang
In this study, we present a novel weighting prediction approach, which explicitly learns the task relationships from downstream training information to address the core challenge of universal speech enhancement.
no code implementations • 23 Oct 2023 • Yingying Gao, Shilei Zhang, Zihao Cui, Chao Deng, Junlan Feng
Cascading multiple pre-trained models is an effective way to compose an end-to-end system.
no code implementations • 20 Oct 2023 • Yingying Gao, Shilei Zhang, Zihao Cui, Yanhan Xu, Chao Deng, Junlan Feng
Self-supervised pre-trained models such as HuBERT and WavLM leverage unlabeled speech data for representation learning and offer significant improvements for numerous downstream tasks.
no code implementations • 12 Jun 2023 • Haiyang Sun, FuLin Zhang, Zheng Lian, Yingying Guo, Shilei Zhang
Additionally, considering that humans adjust their perception of emotional words in textual semantics based on certain cues present in speech, we design a novel search space and search for the optimal fusion strategy for the two types of information.
no code implementations • 26 Jun 2022 • Yingying Gao, Junlan Feng, Chao Deng, Shilei Zhang
Spoken language understanding (SLU) treats automatic speech recognition (ASR) and natural language understanding (NLU) as a unified task and usually suffers from data scarcity.
Automatic Speech Recognition (ASR) +4
no code implementations • 16 Jun 2022 • Yingying Gao, Junlan Feng, Tianrui Wang, Chao Deng, Shilei Zhang
Analysis shows that our proposed approach improves the uniformity of the trained model and noticeably enlarges the CTC spikes.
Automatic Speech Recognition (ASR) +2
no code implementations • 1 Apr 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang
Joint training of a speech enhancement (SE) model and an automatic speech recognition (ASR) model is a common solution for robust ASR in noisy environments.
no code implementations • 25 Feb 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Yanan Chen, Junlan Feng, Shilei Zhang
Therefore, we previously proposed a harmonic gated compensation network (HGCN) to predict the full harmonic locations based on the unmasked harmonics and process the result of a coarse enhancement module to recover the masked harmonics.
1 code implementation • 30 Jan 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang
Mask processing in the time-frequency (T-F) domain with neural networks has been one of the mainstream approaches to single-channel speech enhancement.
no code implementations • 11 Dec 2018 • Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu
Facial expression recognition is a challenging task, arguably because of large intra-class variations and high inter-class similarities.
Facial Expression Recognition (FER) +1
1 code implementation • 5 Jan 2017 • Weishan Dong, Ting Yuan, Kai Yang, Changsheng Li, Shilei Zhang
In this paper, we study learning generalized driving style representations from automobile GPS trip data.