Search Results for author: Genshun Wan

Found 6 papers, 0 papers with code

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

no code implementations • 28 Aug 2023 • Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios.

speaker-diarization, Speaker Diarization, +2

Reducing the gap between streaming and non-streaming Transducer-based ASR by adaptive two-stage knowledge distillation

no code implementations • 27 Jun 2023 • Haitao Tang, Yu Fu, Lei Sun, Jiabin Xue, Dan Liu, Yongchao Li, Zhiqiang Ma, Minghui Wu, Jia Pan, Genshun Wan, Ming'en Zhao

In this paper, we propose an adaptive two-stage knowledge distillation method consisting of hidden layer learning and output layer learning.

Knowledge Distillation, speech-recognition, +1

Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

no code implementations • 7 Dec 2022 • Fenglin Ding, Genshun Wan, Pengcheng Li, Jia Pan, Cong Liu

Multilingual end-to-end models have shown great improvement over monolingual systems.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit

no code implementations • 7 Dec 2022 • Pengcheng Li, Genshun Wan, Fenglin Ding, Hang Chen, Jianqing Gao, Jia Pan, Cong Liu

Speech pre-training has shown great success in learning useful and general latent representations from large-scale unlabeled data.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition

no code implementations • 7 Dec 2022 • Genshun Wan, Tan Liu, Hang Chen, Jia Pan, Cong Liu, Zhongfu Ye

Self-supervised learning (SSL) models have achieved considerable improvements in automatic speech recognition (ASR).

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Self-Supervised Audio-Visual Speech Representations Learning By Multimodal Self-Distillation

no code implementations • 6 Dec 2022 • Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu

AV2vec has a student and a teacher module, in which the student performs a masked latent feature regression task using the multimodal target features generated online by the teacher.

Language Modelling
