no code implementations • 26 Mar 2025 • Siyin Wang, Wenyi Yu, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Lu Lu, Yu Tsao, Junichi Yamagishi, Yuxuan Wang, Chao Zhang
To bridge this gap, we introduce QualiSpeech, a comprehensive low-level speech quality assessment dataset encompassing 11 key aspects and detailed natural language comments that include reasoning and contextual insights.
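To make the low-level format concrete, here is a purely hypothetical sketch of what one annotation record might look like; the field and aspect names are invented and are not the actual QualiSpeech schema.

```python
# Hypothetical record layout (invented field names, NOT the actual QualiSpeech schema):
# numeric scores for low-level quality aspects plus a free-text comment with reasoning.
record = {
    "audio_id": "utt_0001",
    "aspect_scores": {"noisiness": 3, "distortion": 4},  # two of the 11 aspects; names assumed
    "comment": "Mild background hum throughout; consonants sound slightly clipped near the end.",
}
```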
no code implementations • 13 Mar 2025 • Siyin Wang, Zhaoye Fei, Qinyuan Cheng, Shiduo Zhang, Panpan Cai, Jinlan Fu, Xipeng Qiu
Recent advances in large vision-language models (LVLMs) have shown promise for embodied task planning, yet they struggle with fundamental challenges like dependency constraints and efficiency.
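As a rough illustration of the dependency-constraint problem (not the paper's planner), a valid plan must execute actions in an order that respects prerequisite relations, which a topological sort makes explicit:

```python
from graphlib import TopologicalSorter

# Hypothetical action dependencies for an embodied task plan:
# each action may only run after its prerequisites are done.
dependencies = {
    "pick_up_cup": {"open_cabinet"},
    "pour_water": {"pick_up_cup", "turn_on_tap"},
    "serve": {"pour_water"},
}

# A valid execution order must respect every dependency constraint.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # e.g. ['open_cabinet', 'turn_on_tap', 'pick_up_cup', 'pour_water', 'serve']
```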
no code implementations • 27 Jan 2025 • Chen Chen, Yuchen Hu, Siyin Wang, Helin Wang, Zhehuai Chen, Chao Zhang, Chao-Han Huck Yang, Eng Siong Chng
Recent advances have enabled large language models (LLMs) to incorporate auditory systems for handling various speech-related tasks.
no code implementations • 27 Nov 2024 • Wenyi Yu, Siyin Wang, Xiaoyu Yang, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Guangzhi Sun, Lu Lu, Yuxuan Wang, Chao Zhang
Unlike traditional modularised conversational AI systems, which separate speech recognition, understanding, and text-to-speech generation into distinct components, multimodal LLMs operate as single end-to-end models.
1 code implementation • 25 Sep 2024 • Siyin Wang, Wenyi Yu, Yudong Yang, Changli Tang, Yixuan Li, Jimin Zhuang, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Guangzhi Sun, Lu Lu, Yuxuan Wang, Chao Zhang
The results demonstrate that auditory LLMs achieve competitive performance compared to state-of-the-art task-specific small models in predicting MOS and SIM, while also delivering promising results in A/B testing and natural language descriptions.
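For context, MOS prediction is typically scored by comparing predicted scores against human ratings; a minimal sketch with made-up numbers (not results from the paper):

```python
import numpy as np
from scipy.stats import spearmanr

# Illustrative numbers only: human mean-opinion scores vs. model-predicted MOS.
human_mos = np.array([3.8, 4.2, 2.5, 4.6, 3.1])
predicted_mos = np.array([3.6, 4.4, 2.8, 4.5, 3.0])

# Common MOS-prediction metrics: linear (Pearson) and rank (Spearman) correlation, plus MSE.
pearson = np.corrcoef(human_mos, predicted_mos)[0, 1]
spearman, _ = spearmanr(human_mos, predicted_mos)
mse = np.mean((human_mos - predicted_mos) ** 2)
print(f"Pearson={pearson:.3f}  Spearman={spearman:.3f}  MSE={mse:.3f}")
```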
no code implementations • 2 Jul 2024 • Yuchen Hu, Chen Chen, Siyin Wang, Eng Siong Chng, Chao Zhang
By using reverse inference as the criterion for selecting exemplars for RLHF from speech samples generated by the TTS system itself, RIO steers the subsequent optimization towards improved TTS robustness.
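A hedged sketch of the selection idea, with `reverse_score` left as a placeholder for whatever reverse-inference measure is used; this is an illustration, not the RIO implementation:

```python
# Hypothetical sketch of exemplar selection by a reverse-inference score.
# reverse_score() is a placeholder: in RIO-style selection it would measure how well
# the input text can be inferred back from the synthesised speech (e.g. via ASR).

def select_exemplars(samples, reverse_score, k=4):
    """Rank TTS samples by reverse-inference score and split into preferred/rejected sets."""
    ranked = sorted(samples, key=reverse_score, reverse=True)
    return ranked[:k], ranked[-k:]  # best-k as positive exemplars, worst-k as negatives
```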
1 code implementation • 21 Jun 2024 • Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, ShiMin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang
As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount.
no code implementations • 23 Apr 2024 • Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang
Large language models (LLMs) can adapt to new tasks through in-context learning (ICL) based on a few examples presented in dialogue history without any model parameter update.
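A minimal illustration of in-context learning for sentiment classification: the task is specified entirely by examples placed in the prompt, and no model weights are updated.

```python
# Minimal ICL sketch: the demonstrations live in the prompt; no parameter update occurs.
examples = [
    ("The movie was wonderful.", "positive"),
    ("I wasted two hours of my life.", "negative"),
]
query = "The acting felt flat but the score was lovely."

prompt = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in examples)
prompt += f"\nReview: {query}\nSentiment:"
print(prompt)  # this prompt would be sent to the LLM as-is
```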
no code implementations • 5 Mar 2024 • Bo Wang, Tianxiang Sun, Hang Yan, Siyin Wang, Qingyuan Cheng, Xipeng Qiu
The exploration of whether agents can align with their environment without relying on human-labeled data presents an intriguing research topic.
no code implementations • 22 Feb 2024 • Siyin Wang, Jie Zhou, Qin Chen, Qi Zhang, Tao Gui, Xuanjing Huang
Domain adaptation has been widely adopted for cross-domain sentiment analysis to transfer knowledge from the source domain to the target domain.
no code implementations • 17 Feb 2024 • Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang
HAG extends the current text generation paradigm, highlighting the feasibility of endowing LLMs with self-regulated decoding strategies.
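As a loose, hypothetical illustration of what a self-regulated decoding rule could look like (this is not the HAG algorithm), the model's own next-token uncertainty could pick the sampling temperature at each step:

```python
import math

def choose_temperature(token_probs, low=0.2, high=1.0, threshold=2.0):
    """Hypothetical self-regulated decoding rule (not the HAG algorithm):
    sample almost greedily when the model is confident (low entropy),
    and with a higher temperature when the next-token distribution is uncertain."""
    entropy = -sum(p * math.log(p) for p in token_probs if p > 0)
    return high if entropy > threshold else low
```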
no code implementations • 16 Dec 2023 • Jingyi Zhou, Jie Zhou, Jiabao Zhao, Siyin Wang, Haijun Shan, Tao Gui, Qi Zhang, Xuanjing Huang
Few-shot text classification has attracted great interest in both academia and industry due to the lack of labeled data in many fields.
3 code implementations • 5 Oct 2023 • Qinyuan Cheng, Tianxiang Sun, Wenwei Zhang, Siyin Wang, Xiangyang Liu, Mozhi Zhang, Junliang He, Mianqiu Huang, Zhangyue Yin, Kai Chen, Xipeng Qiu
We analyze the primary types of hallucinations across different model types and their causes.
no code implementations • 13 Sep 2023 • Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang
Language-level adaptation experiments on Chinese dialects showed that, when applying SICL to isolated-word ASR, consistent and considerable relative WER reductions can be achieved with Whisper models of any size on two dialects, averaging 32.3%.
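For reference, relative WER reduction is the fraction of the baseline error rate removed by adaptation; the numbers below are illustrative only, not the paper's results.

```python
def relative_wer_reduction(baseline_wer, adapted_wer):
    """Relative WER reduction, as in 'a 32.3% average relative reduction':
    the fraction of the baseline error rate removed by adaptation."""
    return (baseline_wer - adapted_wer) / baseline_wer

# Illustrative numbers only (not from the paper):
print(f"{relative_wer_reduction(0.40, 0.27):.1%}")  # -> 32.5%
```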
no code implementations • COLING 2022 • Siyin Wang, Jie Zhou, Changzhi Sun, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
In this work, we propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).
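For background, the classic instrumental-variable estimator (two-stage least squares) removes confounding bias by regressing on the part of the treatment explained by the instrument; the sketch below is a textbook illustration on synthetic data, not the ISAIV model itself.

```python
import numpy as np

# Textbook two-stage least-squares (2SLS) illustration of the instrumental-variable idea,
# not the ISAIV model: z is an instrument correlated with treatment x but affecting the
# outcome y only through x, which lets us recover the causal effect despite confounding.
rng = np.random.default_rng(0)
n = 10_000
u = rng.normal(size=n)                       # unobserved confounder
z = rng.normal(size=n)                       # instrument
x = 0.8 * z + u + rng.normal(size=n)         # treatment (confounded by u)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)   # outcome; true causal effect of x is 2.0

# Stage 1: predict x from z.  Stage 2: regress y on the predicted x.
x_hat = np.polyval(np.polyfit(z, x, 1), z)
beta_iv = np.polyfit(x_hat, y, 1)[0]
beta_ols = np.polyfit(x, y, 1)[0]
print(f"OLS (biased) {beta_ols:.2f} vs 2SLS {beta_iv:.2f} (true effect 2.0)")
```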