no code implementations • 3 Jan 2025 • Cunhang Fan, Sheng Zhang, Jingjing Zhang, Zexu Pan, Zhao Lv
Decoding speech from brain signals is a challenging research problem that holds significant importance for studying speech processing in the brain.
1 code implementation • 16 Dec 2024 • Yujie Chen, Jiangyan Yi, Cunhang Fan, JianHua Tao, Yong Ren, Siding Zeng, Chu Yuan Zhang, Xinrui Yan, Hao Gu, Jun Xue, Chenglong Wang, Zhao Lv, Xiaohui Zhang
To address this issue, we propose a continual learning method named Region-Based Optimization (RegO) for audio deepfake detection.
1 code implementation • 15 Oct 2024 • Sheng Yan, Cunhang Fan, Hongyu Zhang, Xiaoke Yang, JianHua Tao, Zhao Lv
To address these issues, this paper proposes a dual attention refinement network with spatiotemporal construction for AAD, named DARNet, which consists of the spatiotemporal construction module, dual attention refinement module, and feature fusion & classifier module.
no code implementations • 10 Oct 2024 • Zhanyue Qin, Haochuan Wang, Zecheng Wang, Deyuan Liu, Cunhang Fan, Zhao Lv, Zhiying Tu, Dianhui Chu, Dianbo Sui
At the same time, the experimental results show that, considering both the gender bias of the model and its general code generation capability, MG-Editing is most effective when applied at the row and neuron levels of granularity.
1 code implementation • 20 Sep 2024 • Haoyin Yan, Jie Zhang, Cunhang Fan, Yeping Zhou, Peiqi Liu
Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility.
no code implementations • 24 Jun 2024 • Deyuan Liu, Zhanyue Qin, Hairu Wang, Zhao Yang, Zecheng Wang, Fangying Rong, Qingbin Liu, Yanchao Hao, Xi Chen, Cunhang Fan, Zhao Lv, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui
While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments.
no code implementations • 24 Jun 2024 • Zhanyue Qin, Haochuan Wang, Deyuan Liu, Ziyang Song, Cunhang Fan, Zhao Lv, Jinlin Wu, Zhen Lei, Zhiying Tu, Dianhui Chu, Xiaoyan Yu, Dianbo Sui
In order to answer this question, we propose the UNO Arena based on the card game UNO to evaluate the sequential decision-making capability of LLMs and explain in detail why we choose UNO.
1 code implementation • 19 Jan 2024 • Cunhang Fan, Yujie Chen, Jun Xue, Yonghui Kong, JianHua Tao, Zhao Lv
This paper proposes a progressive distillation method based on masked generation features for the KGC task, aiming to significantly reduce the complexity of pre-trained models.
no code implementations • 7 Sep 2023 • Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, JianHua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu
Specifically, to effectively represent the non-Euclidean properties of EEG signals, dynamical graph convolutional networks are applied to represent the graph structure of EEG signals, which can also extract crucial features related to auditory spatial attention in EEG signals.
no code implementations • 27 Jun 2023 • Shunbo Dong, Jun Xue, Cunhang Fan, Kang Zhu, Yujie Chen, Zhao Lv
The main purpose of this system is to improve the model's ability to learn precise forgery information for the FSD task in low-quality scenarios.
no code implementations • 2 Mar 2023 • Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv
To address this problem, we propose using the deepest network to instruct the shallow networks, thereby enhancing them.
2 code implementations • 11 Nov 2022 • Jiangyan Yi, Chenglong Wang, JianHua Tao, Chu Yuan Zhang, Cunhang Fan, Zhengkun Tian, Haoxin Ma, Ruibo Fu
Some scene fake audio detection benchmark results on the SceneFake dataset are reported in this paper.
no code implementations • 20 Aug 2022 • Chenglong Wang, Jiangyan Yi, JianHua Tao, Haiyang Sun, Xun Chen, Zhengkun Tian, Haoxin Ma, Cunhang Fan, Ruibo Fu
The existing fake audio detection systems often rely on expert experience to design the acoustic features or manually design the hyperparameters of the network structure.
no code implementations • 2 Aug 2022 • Jun Xue, Cunhang Fan, Zhao Lv, JianHua Tao, Jiangyan Yi, Chengshi Zheng, Zhengqi Wen, Minmin Yuan, Shegang Shao
Meanwhile, to make full use of the phase and full-band information, we also propose to use real and imaginary spectrogram features as complementary input features and model the disjoint subbands separately.
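The idea of feeding real and imaginary spectrogram parts as inputs and handling disjoint subbands separately can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the frame length, the hop size, the number of subbands, and the unwindowed naive DFT are not taken from the paper.

```python
import cmath
import math

def dft_frame(frame):
    """Naive one-sided DFT of a real-valued frame (bins 0..N/2)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]

def real_imag_subbands(signal, frame_len=8, hop=4, num_bands=2):
    """Split a signal into frames, then into disjoint frequency subbands.

    Returns one entry per subband; each entry is a list of frames, and each
    frame is a list of (real, imag) pairs for the bins in that subband.
    All parameter values are hypothetical defaults.
    """
    frames = [signal[s:s + frame_len]
              for s in range(0, len(signal) - frame_len + 1, hop)]
    spectra = [dft_frame(f) for f in frames]
    num_bins = frame_len // 2 + 1
    # Integer band edges so that the subbands are disjoint and cover all bins.
    edges = [b * num_bins // num_bands for b in range(num_bands + 1)]
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        bands.append([[(c.real, c.imag) for c in spec[lo:hi]] for spec in spectra])
    return bands

# A sinusoid with exactly one cycle per 8-sample frame.
signal = [math.sin(2 * math.pi * t / 8) for t in range(16)]
bands = real_imag_subbands(signal)
```

Each subband could then be passed to its own model branch. Keeping the real and imaginary parts, rather than the magnitude alone, preserves the phase information.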
no code implementations • 17 Feb 2022 • Jiangyan Yi, Ruibo Fu, JianHua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Xiaohui Zhang, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li, Zheng Lian, Bin Liu
Audio deepfake detection is an emerging topic, which was included in ASVspoof 2021.
1 code implementation • 16 Jul 2021 • Hao Chen, Ming Jin, Zhunan Li, Cunhang Fan, Jinpeng Li, Huiguang He
Although several studies have adopted domain adaptation (DA) approaches to tackle this problem, most of them treat multiple EEG data from different subjects and sessions together as a single source domain for transfer, which either fails to satisfy the assumption of domain adaptation that the source has a certain marginal distribution, or increases the difficulty of adaptation.
no code implementations • 11 Nov 2020 • Cunhang Fan, Bin Liu, JianHua Tao, Jiangyan Yi, Zhengqi Wen, Leichao Song
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
no code implementations • 9 Nov 2020 • Cunhang Fan, Jiangyan Yi, JianHua Tao, Zhengkun Tian, Bin Liu, Zhengqi Wen
Joint training frameworks for speech enhancement and recognition have achieved quite good performance for robust end-to-end automatic speech recognition (ASR).
no code implementations • Pattern Recognition 2020 • Bocheng Zhao, JianHua Tao, Minghao Yang, Zhengkun Tian, Cunhang Fan, Ye Bai
Calligraphy imitation (CI) from a handful of target handwriting samples is such a challenging task that most of the existing writing style analysis or handwriting generation methods do not exhibit satisfactory performance.
no code implementations • 6 Apr 2020 • Cunhang Fan, Jian-Hua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen
In this paper, we propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features, based on deep clustering (DC).
no code implementations • 1 Apr 2020 • Jiangyan Yi, Jian-Hua Tao, Ye Bai, Zhengkun Tian, Cunhang Fan
The other is that POS tags are provided by an external POS tagger.
no code implementations • 17 Mar 2020 • Cunhang Fan, Jian-Hua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu
Secondly, to pay more attention to the outputs of the pre-separation stage, an attention module is applied to acquire deep attention fusion features, which are extracted by computing the similarity between the mixture and the pre-separated speech.
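Similarity-based attention between the mixture and the pre-separated speech can be sketched roughly as follows. This is a minimal illustration only: the scaled dot product as the similarity measure, the concatenation used for fusion, and the names `softmax` and `attention_fusion` are all assumptions, since the snippet above says only that a similarity is computed.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fusion(mixture, pre_separated):
    """Attend from each mixture frame over the pre-separated frames.

    Both inputs are lists of feature vectors with the same dimension
    (hypothetical shapes). Similarity is a scaled dot product, which is
    an assumption made for this sketch.
    """
    dim = len(mixture[0])
    fused = []
    for query in mixture:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in pre_separated
        ]
        weights = softmax(scores)
        # Weighted sum of the pre-separated frames for this mixture frame.
        context = [
            sum(w * key[d] for w, key in zip(weights, pre_separated))
            for d in range(dim)
        ]
        # Fuse by concatenating the mixture frame with its attention context.
        fused.append(list(query) + context)
    return fused

mixture = [[1.0, 0.0], [0.0, 1.0]]
pre_separated = [[1.0, 0.0], [0.0, 1.0]]
fused = attention_fusion(mixture, pre_separated)
```

In this toy input, each mixture frame puts more weight on the pre-separated frame it resembles, so the context half of each fused vector leans toward that frame.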
no code implementations • 5 Feb 2020 • Cunhang Fan, Bin Liu, Jian-Hua Tao, Jiangyan Yi, Zhengqi Wen
Specifically, we apply the deep clustering network to extract deep embedding features.
no code implementations • 23 Jul 2019 • Cunhang Fan, Bin Liu, Jian-Hua Tao, Jiangyan Yi, Zhengqi Wen
Firstly, a DC network is trained to extract deep embedding features, which contain each source's information and have an advantage in discriminating between the target speakers.