no code implementations • 12 Mar 2025 • Yuhuan You, Xihong Wu, Tianshu Qu
As artificial intelligence-generated content (AIGC) continues to evolve, video-to-audio (V2A) generation has emerged as a key area with promising applications in multimedia editing, augmented reality, and automated content creation.
no code implementations • 13 Sep 2024 • Haolin Zhu, Yujie Yan, Xiran Xu, Zhongshu Ge, Pei Tian, Xihong Wu, Jing Chen
One auditory spatial attention detection (ASAD) method, STAnet, was evaluated on this ear-EEG database, achieving 93.1% accuracy with a 1-second decoding window.
1 code implementation • 7 Sep 2024 • Donghang Wu, Yiwen Wang, Xihong Wu, Tianshu Qu
In this paper, we propose CrossMamba for target sound extraction, which leverages the hidden attention mechanism of Mamba to compute dependencies between the given clues and the audio mixture.
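The entry does not spell out Mamba's hidden attention formulation, but the core idea of computing dependencies between a clue and an audio mixture can be illustrated with standard cross-attention: queries come from the clue embedding, keys and values from mixture frames. This is a minimal NumPy sketch of that generic mechanism, not the paper's actual CrossMamba computation; all shapes and names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(clue, mixture):
    """Attend from clue tokens (queries) to mixture frames (keys/values).

    clue:    (Tc, d) embedding of the target-sound clue
    mixture: (Tm, d) frame embeddings of the audio mixture
    Returns a (Tc, d) clue-conditioned summary of the mixture.
    """
    d = clue.shape[-1]
    scores = clue @ mixture.T / np.sqrt(d)   # (Tc, Tm) scaled similarities
    weights = softmax(scores, axis=-1)       # each clue token distributes attention over frames
    return weights @ mixture                 # weighted sum of mixture frames

rng = np.random.default_rng(0)
out = cross_attend(rng.normal(size=(4, 16)), rng.normal(size=(100, 16)))
print(out.shape)  # (4, 16)
```

In the paper's setting, the same dependency structure is realized implicitly through Mamba's state-space recurrence rather than an explicit attention matrix.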
no code implementations • 7 Sep 2024 • Donghang Wu, Xihong Wu, Tianshu Qu
This paper proposes a method utilizing the mutual facilitation mechanism between sound source localization and separation for moving sources.
no code implementations • 27 May 2024 • Xiran Xu, Bo Wang, Boda Xiao, Yadong Niu, Yiwen Wang, Xihong Wu, Jing Chen
Since these EEG data were usually collected with well-designed paradigms in labs, some researchers have questioned the reliability and robustness of the corresponding decoding methods, arguing that the reported decoding accuracy was overestimated due to the inherent temporal autocorrelation of EEG signals.
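The overestimation concern can be demonstrated on synthetic data: if samples within a trial are temporally correlated (here caricatured as a trial-specific offset) and labels are a property of the trial, a sample-level random train/test split lets a classifier "decode" by memorizing which trial a sample came from, while a trial-level split collapses. All numbers below are hypothetical; this is a toy illustration of the statistical pitfall, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples, dim = 10, 50, 4
# Each trial sits at its own offset (mimicking slow, trial-specific dynamics);
# labels alternate per trial, so there is NO genuine sample-level class signal.
X = np.concatenate([t + 0.01 * rng.normal(size=(n_samples, dim)) for t in range(n_trials)])
trial = np.repeat(np.arange(n_trials), n_samples)
y = trial % 2  # label is a property of the trial, not the sample

def knn1_acc(tr, te):
    # 1-nearest-neighbour classification of test samples against train samples
    d = ((X[te][:, None] - X[tr][None]) ** 2).sum(-1)
    return (y[tr][d.argmin(1)] == y[te]).mean()

# Sample-level random split: train and test share trials, so the classifier
# identifies the trial (via autocorrelated structure) and reads off its label.
idx = rng.permutation(len(X))
random_acc = knn1_acc(idx[: len(X) // 2], idx[len(X) // 2 :])

# Trial-level split: held-out trials were never seen; accuracy falls to chance.
block_acc = knn1_acc(np.where(trial < 8)[0], np.where(trial >= 8)[0])
print(random_acc, block_acc)
```

The inflated score under the random split is exactly the artifact the cited critique attributes to temporal autocorrelation, which is why trial- or block-wise cross-validation is the safer evaluation protocol.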
no code implementations • 10 Jan 2024 • Xiran Xu, Bo Wang, Yujie Yan, Haolin Zhu, Zechen Zhang, Xihong Wu, Jing Chen
To investigate the processing of speech in the brain, simple linear models are commonly used to establish a relationship between brain signals and speech features.
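A common instance of such a linear model is a lagged (temporal-response-function-style) regression from a speech feature, such as the envelope, to a brain-signal channel, fit with ridge regularization. This is a generic sketch on synthetic data, not the paper's specific model; the lag range, regularization strength, and feature choice are all hypothetical.

```python
import numpy as np

def lagged(x, lags):
    """Stack time-lagged copies of stimulus x into a design matrix
    (circular shift via np.roll; fine for a sketch)."""
    return np.stack([np.roll(x, l) for l in lags], axis=1)

rng = np.random.default_rng(0)
T, lags = 2000, range(8)
envelope = rng.normal(size=T)                       # speech feature (e.g. envelope)
true_trf = np.array([0.0, 0.5, 1.0, 0.5, 0.2, 0.0, 0.0, 0.0])
X = lagged(envelope, lags)
eeg = X @ true_trf + 0.1 * rng.normal(size=T)       # synthetic "EEG" channel

# Ridge-regularised least squares: w = (X'X + lam*I)^-1 X'y
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
r = np.corrcoef(X @ w, eeg)[0, 1]                   # forward-model prediction accuracy
print(r)
```

The fitted weights `w` recover the simulated response function, and the prediction correlation `r` is the usual figure of merit for such encoding models.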
1 code implementation • 10 Jan 2024 • Bo Wang, Xiran Xu, Zechen Zhang, Haolin Zhu, Yujie Yan, Xihong Wu, Jing Chen
Contrastive learning was used to relate EEG features to speech features.
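A standard way to relate two embedding streams contrastively is an InfoNCE-style objective: matched EEG/speech pairs in a batch are positives, all other pairings are negatives. The sketch below shows that objective on synthetic embeddings; it is a generic formulation, not necessarily the exact loss used in the paper, and the temperature and dimensions are hypothetical.

```python
import numpy as np

def info_nce(eeg_emb, speech_emb, tau=0.1):
    """Symmetric InfoNCE loss: matched rows are positives,
    all other in-batch pairings are negatives."""
    def unit(z):
        return z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = unit(eeg_emb) @ unit(speech_emb).T / tau   # (B, B) scaled cosine similarities
    log_p = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))       # EEG -> speech
    log_p_t = sim.T - np.log(np.exp(sim.T).sum(axis=1, keepdims=True)) # speech -> EEG
    pos = np.arange(len(sim))
    return -(log_p[pos, pos].mean() + log_p_t[pos, pos].mean()) / 2

rng = np.random.default_rng(0)
speech = rng.normal(size=(8, 32))
aligned = speech + 0.1 * rng.normal(size=(8, 32))    # EEG embeddings near their speech pair
shuffled = rng.normal(size=(8, 32))                  # unrelated embeddings
loss_match = info_nce(aligned, speech)
loss_rand = info_nce(shuffled, speech)
print(loss_match, loss_rand)
```

Well-aligned pairs drive the loss toward zero, while unrelated embeddings stay near the chance level of roughly log(batch size), which is what gives the learned EEG features their correspondence to speech features.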
1 code implementation • 14 Sep 2023 • Xiran Xu, Bo Wang, Yujie Yan, Xihong Wu, Jing Chen
ASAD methods are inspired by the brain lateralization of cortical neural responses during the processing of auditory spatial attention, and show promising performance for the task of auditory attention decoding (AAD) with neural recordings.
no code implementations • 10 Oct 2021 • Shan Gao, Xihong Wu, Tianshu Qu
This paper proposes a deconvolution-based network (DCNN) model for DOA estimation of the direct source and early reflections in reverberant scenarios.
no code implementations • 3 Mar 2021 • Zhen Fu, Bo Wang, Xihong Wu, Jing Chen
In this paper, we proposed a novel convolutional recurrent neural network (CRNN)-based regression model and classification model, and compared them with both the linear model and state-of-the-art DNN models.
no code implementations • 3 Mar 2021 • Zhen Fu, Bo Wang, Fei Chen, Xihong Wu, Jing Chen
These results indicated the feasibility of estimating eye gaze with HEOG and NEMG.
no code implementations • 20 Jun 2020 • Yifan Sun, Xihong Wu
The proposed approach works in an analysis-by-synthesis manner to learn an inference network by iteratively sampling and training.
no code implementations • 11 Oct 2016 • Xiangang Li, Xihong Wu
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-art performance on many speech recognition tasks, as they can learn a dynamically changing contextual window over the entire sequence history.
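The contextual window mentioned above is maintained by the LSTM's gated cell state. A single LSTM time step can be sketched in NumPy as below; the feature and hidden sizes are hypothetical (13 stands in for MFCC-like frame features), and this is the textbook cell, not the paper's specific architecture.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step. Gate parameters are stacked row-wise
    in the order [input, forget, output, candidate]."""
    z = W @ x + U @ h + b
    d = h.size
    i, f, o = (sigmoid(z[k * d:(k + 1) * d]) for k in range(3))
    g = np.tanh(z[3 * d:])      # candidate cell update
    c_new = f * c + i * g       # forget old memory, write new content
    h_new = o * np.tanh(c_new)  # exposed hidden state
    return h_new, c_new

rng = np.random.default_rng(0)
d_in, d_h = 13, 8               # hypothetical: 13 acoustic features per frame
W = 0.1 * rng.normal(size=(4 * d_h, d_in))
U = 0.1 * rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h = c = np.zeros(d_h)
for frame in rng.normal(size=(20, d_in)):  # run over a short feature sequence
    h, c = lstm_step(frame, h, c, W, U, b)
print(h.shape)  # (8,)
```

Because the forget gate `f` decides per-dimension how much of `c` survives each step, the effective context length adapts to the input rather than being fixed, which is the property both LSTM entries in this list rely on for acoustic modeling.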
no code implementations • 16 Oct 2014 • Xiangang Li, Xihong Wu
Long short-term memory (LSTM) based acoustic modeling methods have recently been shown to give state-of-the-art performance on some speech recognition tasks.