no code implementations • 27 Nov 2023 • Zezhong Jin, Youzhi Tu, Man-Wai Mak
The intuition is that phonetic information can preserve low-level acoustic dynamics with speaker information and thus partly compensate for the degradation due to noise and reverberation.
no code implementations • 23 Sep 2023 • Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien
Contrastive speaker embedding assumes that the contrast between the positive and negative pairs of speech segments is attributed to speaker identity only.
no code implementations • 14 May 2023 • Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu
Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings.
Automatic Speech Recognition (ASR)