no code implementations • 24 Aug 2023 • Yu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu
Our final system is a fusion of six models and achieves first place in Track 1 and second place in Track 2 of VoxSRC 2023.
no code implementations • 20 Jun 2023 • Xuefei Wang, Yanhua Long, Yijie Li, Haoran Wei
Moreover, we propose to train the Aformer in a multi-pass manner, and investigate three cross-information fusion methods to effectively combine the information from both general and accent encoders.
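The abstract names cross-information fusion between a general encoder and an accent encoder without detailing the operations. As an illustration only (the Aformer's actual fusion methods may differ), here is a minimal numpy sketch of two standard ways to combine two encoders' frame-level outputs: concatenation with a projection, and cross-attention with a residual connection.

```python
import numpy as np

def concat_fusion(general_feats, accent_feats, proj):
    """Fuse two encoders' frame-level outputs by concatenation + projection.
    general_feats, accent_feats: (T, D); proj: (2D, D)."""
    fused = np.concatenate([general_feats, accent_feats], axis=-1)  # (T, 2D)
    return fused @ proj                                             # (T, D)

def cross_attention_fusion(query_feats, key_feats):
    """Fuse by letting one encoder's frames attend over the other's."""
    d = query_feats.shape[-1]
    scores = query_feats @ key_feats.T / np.sqrt(d)            # (T, T)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)             # row-wise softmax
    return query_feats + weights @ key_feats                   # residual add

rng = np.random.default_rng(0)
g = rng.standard_normal((5, 8))            # general-encoder frames
a = rng.standard_normal((5, 8))            # accent-encoder frames
proj = rng.standard_normal((16, 8)) * 0.1  # hypothetical projection matrix
print(concat_fusion(g, a, proj).shape)      # (5, 8)
print(cross_attention_fusion(g, a).shape)   # (5, 8)
```

Both variants keep the fused representation at the original feature dimension, so either can drop into the same downstream decoder.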
no code implementations • 22 Nov 2022 • Xiaofeng Ge, Jiangyu Han, Haixin Guan, Yanhua Long
Recently, many personalized speech enhancement (PSE) systems with excellent performance have been proposed.
no code implementations • 3 Nov 2022 • Li Li, Dongxing Xu, Haoran Wei, Yanhua Long
Exploiting effective target modeling units is very important and has always been a concern in end-to-end automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +2
no code implementations • 31 Oct 2022 • Jiangyu Han, Yuhang Cao, Heng Lu, Yanhua Long
In recent years, speaker diarization has attracted widespread attention.
Automatic Speech Recognition (ASR) +4
no code implementations • 23 Apr 2022 • Jiangyu Han, Yanhua Long
SCT follows a framework that uses two heterogeneous neural networks (HNNs) to produce high-confidence pseudo-labels for unlabeled real speech mixtures.
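The core idea of using two heterogeneous networks to pseudo-label real mixtures can be sketched as follows. This is a simplified illustration, not SCT's actual recipe: here cross-model agreement (SI-SNR between the two models' outputs) stands in for whatever confidence measure the paper uses, and the models are toy functions.

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR between two signals, in dB."""
    ref_energy = np.sum(ref ** 2) + eps
    proj = (np.sum(est * ref) / ref_energy) * ref  # projection onto ref
    noise = est - proj
    return 10 * np.log10((np.sum(proj ** 2) + eps) / (np.sum(noise ** 2) + eps))

def select_pseudo_labels(mixtures, model_a, model_b, agree_db=10.0):
    """Keep mixtures where two heterogeneous models agree strongly;
    use one model's estimate as the pseudo-label."""
    selected = []
    for mix in mixtures:
        est_a, est_b = model_a(mix), model_b(mix)
        if si_snr(est_a, est_b) >= agree_db:  # agreement as confidence
            selected.append((mix, est_a))
    return selected

rng = np.random.default_rng(0)
mixtures = [rng.standard_normal(100) for _ in range(3)]
model_a = lambda m: 0.9 * m            # toy "separators"
model_b_agree = lambda m: 0.8 * m      # agrees with model_a up to scale
model_b_differ = lambda m: np.roll(m, 50)  # decorrelated output

print(len(select_pseudo_labels(mixtures, model_a, model_b_agree)))  # 3
print(len(select_pseudo_labels(mixtures, model_a, model_b_differ)))  # 0
```

Because SI-SNR is scale-invariant, the two agreeing toy models pass the threshold despite their different gains, while the decorrelated model is filtered out.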
no code implementations • 4 Mar 2022 • Xiaofeng Ge, Jiangyu Han, Yanhua Long, Haixin Guan
Finally, we propose to integrate the losses on complex subband gain, SNR, and pitch filtering strength with an OA loss in a multi-objective learning manner to further improve speech enhancement performance.
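A multi-objective loss of this kind is typically a weighted sum of the individual terms. The sketch below is generic: the gain-MSE and negative-SNR terms are common placeholder objectives, not the paper's exact definitions of the subband-gain, pitch-filtering, or OA losses.

```python
import numpy as np

def gain_mse(est_gain, ref_gain):
    """MSE between estimated and reference (sub)band gains."""
    return float(np.mean((est_gain - ref_gain) ** 2))

def neg_snr_db(est, ref, eps=1e-8):
    """Negative SNR in dB, so that lower is better as a loss."""
    noise = est - ref
    return float(-10.0 * np.log10((np.sum(ref ** 2) + eps)
                                  / (np.sum(noise ** 2) + eps)))

def multi_objective(losses, weights):
    """Combine several objectives into one scalar training loss."""
    assert len(losses) == len(weights)
    return float(sum(w * l for w, l in zip(weights, losses)))

rng = np.random.default_rng(0)
ref = rng.standard_normal(200)
est = ref + 0.1 * rng.standard_normal(200)       # lightly noisy estimate
g_ref = rng.uniform(0, 1, 64)                    # toy per-band gains
g_est = np.clip(g_ref + 0.05, 0, 1)

total = multi_objective([gain_mse(g_est, g_ref), neg_snr_db(est, ref)],
                        weights=[1.0, 0.1])
print(total)
```

The weights are hyperparameters that balance the terms' different scales (an MSE near zero vs. an SNR term in dB).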
no code implementations • 4 Mar 2022 • Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang
In recent years, exploring effective sound separation (SSep) techniques to improve overlapping sound event detection (SED) has attracted increasing attention.
1 code implementation • 27 Dec 2021 • Jiangyu Han, Yanhua Long, Lukas Burget, Jan Cernocky
In particular, we find that Mixture-Remix fine-tuning with DPCCN significantly outperforms TD-SpeakerBeam for unsupervised cross-domain TSE, with around 3.5 dB SI-SNR improvement on the target-domain test set and no source-domain performance degradation.
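One plausible reading of Mixture-Remix style adaptation is: separate unlabeled target-domain mixtures with a pretrained model, then remix the estimated sources to create synthetic mixtures whose "references" are known, enabling supervised fine-tuning. The sketch below follows that reading and may differ from the paper's exact recipe; the toy `separate` function is purely illustrative.

```python
import numpy as np

def mixture_remix_pairs(mixtures, separate, rng):
    """Build pseudo-supervised (mixture, references) pairs for
    target-domain fine-tuning by remixing estimated sources
    across utterances."""
    ests = [separate(m) for m in mixtures]             # each: (2, T) estimates
    order = rng.permutation(len(ests))
    pairs = []
    for i, j in zip(range(len(ests)), order):
        s1, s2 = ests[i][0], ests[j][1]                # cross-utterance remix
        pairs.append((s1 + s2, np.stack([s1, s2])))    # mixture, stacked refs
    return pairs

rng = np.random.default_rng(0)
mixtures = [rng.standard_normal(50) for _ in range(3)]
separate = lambda m: np.stack([0.6 * m, 0.4 * m])      # toy two-source separator
pairs = mixture_remix_pairs(mixtures, separate, rng)
print(len(pairs), pairs[0][0].shape, pairs[0][1].shape)
```

By construction, each new mixture is exactly the sum of its stacked references, which is what makes supervised losses applicable on unlabeled target-domain data.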
no code implementations • 6 Jun 2021 • Jiangyu Han, Wei Rao, Yannan Wang, Yanhua Long
Moreover, new combination strategies of the CD-based spatial information and target speaker adaptation of parallel encoder outputs are also investigated.
no code implementations • 26 Mar 2021 • Tiantian Tang, Xinyuan Zhou, Yanhua Long, Yijie Li, Jiaen Liang
Domain mismatch is a noteworthy issue in acoustic event detection tasks, as the target domain data is difficult to access in most real applications.
no code implementations • 23 Mar 2021 • Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang, Yuping Wang
A good joint training framework is very helpful for simultaneously improving the performance of weakly supervised audio tagging (AT) and acoustic event detection (AED).
1 code implementation • 19 Oct 2020 • Jiangyu Han, Xinyuan Zhou, Yanhua Long, Yijie Li
In this work, we propose two methods for exploiting the multi-channel spatial information to extract the target speech.
Speech Extraction • Audio and Speech Processing
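The snippet above does not say which multi-channel spatial features are used; a standard choice in target-speech extraction is the inter-channel phase difference (IPD), shown here purely as an illustration. Encoding the phase difference as cosine/sine pairs avoids phase-wrapping discontinuities.

```python
import numpy as np

def ipd_features(stft_ref, stft_other):
    """Inter-channel phase difference (IPD) between two channels'
    complex STFTs of shape (T, F), encoded as cos/sin -> (T, F, 2)."""
    phase_diff = np.angle(stft_ref) - np.angle(stft_other)
    return np.stack([np.cos(phase_diff), np.sin(phase_diff)], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
y = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
print(ipd_features(x, y).shape)  # (4, 6, 2)
```

Identical channels give a zero phase difference, i.e. cos = 1 and sin = 0 everywhere, which is a quick sanity check for the feature extraction.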
no code implementations • 19 Oct 2020 • Jiangyu Han, Wei Rao, Yanhua Long, Jiaen Liang
Furthermore, by introducing a mixture embedding matrix pooling method, our proposed attention-based scaling adaptation (ASA) can exploit the target speaker clues in a more efficient way.
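An attention-based pooling of a mixture embedding matrix can be sketched as follows. This is a hypothetical simplification of the ASA idea, not the paper's implementation: frames of the (T, D) embedding matrix are weighted by their similarity to a target-speaker query vector and summed into a single clue vector.

```python
import numpy as np

def attention_pool(embeddings, query):
    """Pool a (T, D) mixture embedding matrix into one D-dim vector,
    weighting frames by dot-product similarity to the speaker query."""
    d = query.shape[-1]
    scores = embeddings @ query / np.sqrt(d)   # (T,)
    w = np.exp(scores - scores.max())
    w /= w.sum()                               # softmax over frames
    return w @ embeddings                      # (D,)

rng = np.random.default_rng(0)
emb = rng.standard_normal((10, 16))        # toy frame embeddings
query = rng.standard_normal(16)            # toy target-speaker embedding
print(attention_pool(emb, query).shape)    # (16,)
```

With a zero query, all frames receive equal weight and the pooled vector reduces to the frame-wise mean, which is a useful sanity check on the softmax weighting.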