Search Results for author: Genshun Wan

Found 9 papers, 1 paper with code

Lightweight Transducer Based on Frame-Level Criterion

1 code implementation • 5 Sep 2024 • Genshun Wan, Mengzhi Wang, Tingzhi Mao, Hang Chen, Zhongfu Ye

A transducer model trained with a sequence-level criterion requires a lot of memory because it generates a large probability matrix.

Decoder, imbalanced classification +1

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

no code implementations • 28 Aug 2023 • Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios.

Speaker Diarization +2

Self-Supervised Audio-Visual Speech Representations Learning By Multimodal Self-Distillation

no code implementations • 6 Dec 2022 • Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu

AV2vec has a student and a teacher module, in which the student performs a masked latent feature regression task using the multimodal target features generated online by the teacher.

Language Modelling
