no code implementations • 7 Apr 2024 • Yuang Li, Min Zhang, Mengxin Ren, Miaomiao Ma, Daimeng Wei, Hao Yang
Audio deepfake detection (ADD) is essential for preventing the misuse of synthetic voices that may infringe on personal rights and privacy.
no code implementations • 21 Jan 2024 • Yuang Li, Jiawei Yu, Yanqing Zhao, Min Zhang, Mengxin Ren, Xiaofeng Zhao, Xiaosong Qiao, Chang Su, Miaomiao Ma, Hao Yang
In this work, we connect the Whisper encoder with ChatGLM3 and provide in-depth comparisons of these two approaches using Chinese automatic speech recognition (ASR) and name entity recognition (NER) tasks.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
no code implementations • 18 Sep 2023 • Yuang Li, Yinglu Li, Min Zhang, Chang Su, Mengxin Ren, Xiaosong Qiao, Xiaofeng Zhao, Mengyao Piao, Jiawei Yu, Xinglin Lv, Miaomiao Ma, Yanqing Zhao, Hao Yang
End-to-end automatic speech recognition (ASR) systems often struggle to recognize rare name entities, such as personal names, organizations, and terminologies not frequently encountered in the training data.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3