1 code implementation • 7 Jan 2025 • Zhe Li, Man-Wai Mak, Mert Pilanci, Hung-Yi Lee, Helen Meng
Previous research has shown that the principal singular vectors of a pre-trained model's weight matrices capture critical knowledge.
no code implementations • 7 Jan 2025 • Jinchao Li, Yuejiao Wang, Junan Li, Jiawen Kang, Bo Zheng, Simon Wong, Brian Mak, Helene Fung, Jean Woo, Man-Wai Mak, Timothy Kwok, Vincent Mok, Xianmin Gong, Xixin Wu, Xunying Liu, Patrick Wong, Helen Meng
DTM-based approach validated the effectiveness of dynamic topic consistency as a macrostructural metric (F1=0.61, AUC=0.78).
no code implementations • 11 Dec 2024 • Junjie Li, Ke Zhang, Shuai Wang, Kong Aik Lee, Man-Wai Mak, Haizhou Li
Audio-visual Target Speaker Extraction (AV-TSE) aims to isolate the speech of a specific target speaker from an audio mixture using time-synchronized visual cues.
no code implementations • 1 Mar 2024 • Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee
This forces the model to learn a speaker distribution disentangled from the semantic content.
no code implementations • 27 Nov 2023 • Zezhong Jin, Youzhi Tu, Man-Wai Mak
The intuition is that phonetic information can preserve low-level acoustic dynamics with speaker information and thus partly compensate for the degradation due to noise and reverberation.
no code implementations • 23 Sep 2023 • Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien
Contrastive speaker embedding assumes that the contrast between the positive and negative pairs of speech segments is attributed to speaker identity only.
no code implementations • 8 Sep 2023 • Chong-Xin Gan, Man-Wai Mak, Weiwei Lin, Jen-Tzung Chien
Contrastive self-supervised learning (CSL) for speaker verification (SV) has drawn increasing interest recently due to its ability to exploit unlabeled data.
1 code implementation • 18 Aug 2023 • Chongkai Lu, Man-Wai Mak, Ruimin Li, Zheru Chi, Hong Fu
The framework locates actions in videos by detecting the action evolution process.
no code implementations • 14 May 2023 • Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu
Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings.
no code implementations • 28 Mar 2023 • Haiquan Mao, Feng Hong, Man-Wai Mak
Inspired by the self-training strategies that use an existing classifier to label the unlabeled data for retraining, we propose a cluster-guided UDA framework that labels the target domain data by clustering and combines the labeled source domain data and pseudo-labeled target domain data to train a speaker embedding network.
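The pseudo-labeling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes speaker embeddings have already been extracted as NumPy arrays, uses plain k-means as a stand-in for whatever clustering the paper applies, and treats target-domain clusters as new speaker classes disjoint from the source speakers. The names `kmeans_labels` and `build_training_set` are hypothetical.

```python
import numpy as np


def kmeans_labels(x, k, n_iter=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns one cluster index per row of x."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows (fancy indexing returns a copy).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each embedding to its nearest centroid.
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (leave it if empty).
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels


def build_training_set(src_x, src_y, tgt_x, n_clusters):
    """Combine labeled source data with pseudo-labeled target data.

    Assumption: target clusters are treated as extra classes, so their
    indices are offset past the largest source label.
    """
    pseudo = kmeans_labels(tgt_x, n_clusters)
    offset = int(src_y.max()) + 1
    x = np.concatenate([src_x, tgt_x])
    y = np.concatenate([src_y, pseudo + offset])
    return x, y


# Toy data standing in for source/target speaker embeddings.
rng = np.random.default_rng(1)
src_x = rng.normal(size=(6, 4))
src_y = np.array([0, 0, 1, 1, 2, 2])
tgt_x = rng.normal(size=(10, 4))
x, y = build_training_set(src_x, src_y, tgt_x, n_clusters=2)
```

The combined `(x, y)` pair would then feed the training of a speaker embedding network; that training loop is omitted here because the abstract does not specify it.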
no code implementations • 29 Oct 2022 • Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng
The challenges in applying contrastive learning to speaker verification (SV) are that the softmax-based contrastive loss lacks discriminative power and that the hard negative pairs can easily influence learning.
1 code implementation • 29 Oct 2022 • Zhe Li, Man-Wai Mak
A great challenge in speaker representation learning using deep models is to design learning objectives that can enhance the discrimination of unseen speakers under unseen domains.
no code implementations • 8 Aug 2017 • Shibiao Wan, Man-Wai Mak, Sun-Yuan Kung
In the post-genomic era, large-scale personal DNA sequences are produced and collected for genetic medical diagnoses and new drug discovery, which, however, simultaneously poses serious challenges to the protection of personal genomic privacy.