no code implementations • 2 Jun 2023 • Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan
Self-supervised pre-trained models such as Wav2vec2, HuBERT, and WavLM have been shown to significantly improve performance on many speech tasks.
no code implementations • 13 Oct 2022 • Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan
Labeled audio data is insufficient to build satisfactory speech recognition systems for most of the world's languages.
no code implementations • 26 Apr 2022 • Siqi Zheng, Hongbin Suo
In this paper, we propose to view clustering-based diarization as a community detection problem.
no code implementations • 25 Apr 2022 • Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li
In this work, we present a graph convolutional network (GCN)-based approach for semi-supervised learning.
no code implementations • 9 Sep 2021 • Siqi Zheng, Shiliang Zhang, Weilong Huang, Qian Chen, Hongbin Suo, Ming Lei, Jinwei Feng, Zhijie Yan
We propose BeamTransformer, an efficient architecture that leverages a beamformer's edge in spatial filtering and a transformer's capability in context sequence modeling.
no code implementations • 20 Jul 2021 • Siqi Zheng, Weilong Huang, Xianliang Wang, Hongbin Suo, Jinwei Feng, Zhijie Yan
In this paper, we describe a speaker diarization system that enables localization and identification of all speakers present in a conversation or meeting.