no code implementations • 21 Feb 2025 • Weiqiao Shan, Yuang Li, Yuhao Zhang, Yingfeng Luo, Chen Xu, Xiaofeng Zhao, Long Meng, Yunfei Lu, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu
Connecting audio encoders with large language models (LLMs) allows the LLM to perform various audio understanding tasks, such as automatic speech recognition (ASR) and audio captioning (AC).
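The encoder-to-LLM connection described above can be sketched minimally as a projector that maps audio features into the LLM's embedding space. The dimensions and the single linear projector below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Hypothetical sizes: AUDIO_DIM and LLM_DIM are assumptions for illustration.
AUDIO_DIM = 512   # assumed audio-encoder hidden size
LLM_DIM = 4096    # assumed LLM embedding size

rng = np.random.default_rng(0)
W = rng.standard_normal((AUDIO_DIM, LLM_DIM)) * 0.02  # trainable linear projector

def project_audio_features(audio_feats: np.ndarray) -> np.ndarray:
    """Map audio-encoder outputs into the LLM's embedding space."""
    return audio_feats @ W

# 100 audio frames become 100 pseudo-tokens the LLM can attend to
frames = rng.standard_normal((100, AUDIO_DIM))
tokens = project_audio_features(frames)
print(tokens.shape)  # (100, 4096)
```

The projected frames are then prepended (or interleaved) with text-token embeddings, so tasks like ASR and audio captioning reduce to conditional text generation.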
1 code implementation • 14 Jan 2025 • Weiqiao Shan, Yuhao Zhang, Yuchen Han, Bei Li, Xiaofeng Zhao, Yuang Li, Min Zhang, Hao Yang, Tong Xiao, Jingbo Zhu
Recent advancements have highlighted the efficacy of self-supervised learning (SSL) features in various speech-related tasks, providing lightweight and versatile multi-view speech representations.
no code implementations • 2 Dec 2024 • Yuhe Ji, Yilun Liu, Feiyu Yao, Minggui He, Shimin Tao, Xiaofeng Zhao, Su Chang, Xinhua Yang, Weibin Meng, Yuming Xie, Boxing Chen, Hao Yang
The increasing complexity of computer systems necessitates innovative approaches to fault and error management, going beyond traditional manual log analysis.
no code implementations • 20 Nov 2024 • Jiawei Yu, Yuang Li, Xiaosong Qiao, Huan Zhao, Xiaofeng Zhao, Wei Tang, Min Zhang, Hao Yang, Jinsong Su
Existing research primarily utilizes additional text data and predefined speech styles supported by TTS models.
no code implementations • 20 Sep 2024 • Yuang Li, Xiaosong Qiao, Xiaofeng Zhao, Huan Zhao, Wei Tang, Min Zhang, Hao Yang
Large language models can enhance automatic speech recognition systems through generative error correction.
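Generative error correction can be sketched as prompting an LLM with the recognizer's N-best hypotheses and asking for a corrected transcription. The prompt template below is an illustrative assumption, not the paper's exact format:

```python
# Sketch of LLM-based generative error correction (GER) for ASR.
def build_ger_prompt(nbest: list[str]) -> str:
    """Pack an ASR N-best list into a correction prompt for an LLM."""
    hyps = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(nbest))
    return (
        "The following are N-best hypotheses from a speech recognizer.\n"
        f"{hyps}\n"
        "Report the single most likely correct transcription."
    )

prompt = build_ger_prompt([
    "i scream for ice cream",
    "ice cream for ice cream",
])
print(prompt)
```

The returned prompt would be sent to the LLM, whose completion serves as the corrected transcript.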
1 code implementation • 22 May 2024 • Xiang Geng, Ming Zhu, Jiahuan Li, Zhejian Lai, Wei Zou, Shuaijie She, Jiaxin Guo, Xiaofeng Zhao, Yinglu Li, Yuang Li, Chang Su, Yanqing Zhao, Xinglin Lyu, Min Zhang, Jiajun Chen, Hao Yang, ShuJian Huang
For the second issue, we propose a method comprising two synergistic components: low-rank adaptation for training to maintain the original LLM parameters, and recovery KD, which utilizes data generated by the chat LLM itself to recover the original knowledge from the frozen parameters.
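The low-rank adaptation component can be illustrated with a minimal sketch. The dimensions, rank, and zero-initialization of the up-projection are standard LoRA conventions assumed here, not details taken from the paper:

```python
import numpy as np

d, k, r = 64, 64, 4  # assumed layer dims and LoRA rank, for illustration
rng = np.random.default_rng(0)

W0 = rng.standard_normal((d, k))        # frozen pretrained weight (never updated)
A = rng.standard_normal((d, r)) * 0.01  # trainable down-projection
B = np.zeros((r, k))                    # trainable up-projection, zero-initialized

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Output = frozen path + low-rank update. With B = 0 this equals the
    # original model exactly, so the base LLM's parameters are preserved.
    return x @ W0 + x @ A @ B

x = rng.standard_normal((2, d))
out = lora_forward(x)
print(np.allclose(out, x @ W0))  # True while B is zero-initialized
```

Only `A` and `B` would receive gradient updates during fine-tuning, which is why the original LLM parameters stay intact and remain available for the recovery step.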
1 code implementation • 28 Feb 2024 • Yuan Ge, Yilun Liu, Chi Hu, Weibin Meng, Shimin Tao, Xiaofeng Zhao, Hongxia Ma, Li Zhang, Boxing Chen, Hao Yang, Bei Li, Tong Xiao, Jingbo Zhu
Given the significant resource allocation required for training and evaluating models, it is advantageous to have an efficient method for selecting high-quality IT data.
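One way such a selection method can operate is to score every instruction-tuning example and keep the top-k. The quality proxy below (response length weighted by lexical diversity) is a hypothetical stand-in, not the paper's actual scoring model:

```python
# Hypothetical scoring-based selection of instruction-tuning (IT) data.
def quality_score(example: dict) -> float:
    """Toy quality proxy: longer, lexically diverse responses score higher."""
    words = example["response"].split()
    if not words:
        return 0.0
    diversity = len(set(words)) / len(words)
    return len(words) * diversity

def select_top_k(pool: list[dict], k: int) -> list[dict]:
    """Keep the k highest-scoring examples from the candidate pool."""
    return sorted(pool, key=quality_score, reverse=True)[:k]

pool = [
    {"instruction": "say hi", "response": "hi hi hi hi"},
    {"instruction": "explain TCP", "response": "TCP provides reliable ordered byte streams"},
    {"instruction": "noop", "response": ""},
]
best = select_top_k(pool, 1)
print(best[0]["instruction"])  # explain TCP
```

Replacing the toy scorer with a learned quality model yields the same pipeline: score once, then train only on the selected subset.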
no code implementations • 21 Jan 2024 • Yuang Li, Jiawei Yu, Min Zhang, Mengxin Ren, Yanqing Zhao, Xiaofeng Zhao, Shimin Tao, Jinsong Su, Hao Yang
In this work, we connect the Whisper encoder with ChatGLM3 and provide in-depth comparisons of these two approaches using Chinese automatic speech recognition (ASR) and named entity recognition (NER) tasks.

2 code implementations • 22 Nov 2023 • Yilun Liu, Shimin Tao, Xiaofeng Zhao, Ming Zhu, Wenbing Ma, Junhao Zhu, Chang Su, Yutai Hou, Miao Zhang, Min Zhang, Hongxia Ma, Li Zhang, Hao Yang, Yanfei Jiang
Instruction tuning is crucial for enabling Large Language Models (LLMs) to respond to human instructions.
1 code implementation • 9 Dec 2021 • Feiliang Ren, Longhui Zhang, Xiaofeng Zhao, Shujuan Yin, Shilei Liu, Bochao Li
Moreover, experiments show that both the proposed bidirectional extraction framework and the share-aware learning mechanism have good adaptability and can be used to improve the performance of other tagging-based methods.
no code implementations • 4 Nov 2021 • Mingao Yuan, Bin Zhao, Xiaofeng Zhao
In practice, a network may have censored (or missing) values, and censored values are shown to have a non-negligible effect on the structural properties of a network.
1 code implementation • EMNLP 2021 • Feiliang Ren, Longhui Zhang, Shujuan Yin, Xiaofeng Zhao, Shilei Liu, Bochao Li, Yaduo Liu
Next, the mined global associations are integrated into the table feature of each relation.
1 code implementation • EMNLP 2021 • Shilei Liu, Xiaofeng Zhao, Bochao Li, Feiliang Ren, Longhui Zhang, Shujuan Yin
Neural conversation models have shown great potential for generating fluent and informative responses by incorporating external background knowledge.
no code implementations • 31 Aug 2021 • Shilei Liu, Xiaofeng Zhao, Bochao Li, Feiliang Ren
Knowledge-grounded dialogue is the task of generating a fluent and informative response based on both the conversation context and a collection of external knowledge, in which knowledge selection plays an important role and is attracting increasing research interest.
1 code implementation • 20 Aug 2021 • Feiliang Ren, Longhui Zhang, Shujuan Yin, Xiaofeng Zhao, Shilei Liu, Bochao Li
Tagging-based methods are among the mainstream approaches to relational triple extraction.
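A toy illustration of how tagging-based extraction recovers triples from token tags. The BIO scheme and the single fixed relation below are simplified assumptions; real tagging-based extractors use richer tag sets:

```python
# Decode BIO tags into text spans, then pair subject/object spans into triples.
def decode_spans(tokens: list[str], tags: list[str], prefix: str) -> list[str]:
    """Collect contiguous B-/I- tagged spans of the given type."""
    spans, cur = [], []
    for tok, tag in zip(tokens, tags):
        if tag == f"B-{prefix}":
            if cur:
                spans.append(" ".join(cur))
            cur = [tok]
        elif tag == f"I-{prefix}" and cur:
            cur.append(tok)
        else:
            if cur:
                spans.append(" ".join(cur))
            cur = []
    if cur:
        spans.append(" ".join(cur))
    return spans

tokens = ["Marie", "Curie", "was", "born", "in", "Warsaw"]
tags   = ["B-SUBJ", "I-SUBJ", "O", "O", "O", "B-OBJ"]
subjects = decode_spans(tokens, tags, "SUBJ")
objects = decode_spans(tokens, tags, "OBJ")
triples = [(s, "born_in", o) for s in subjects for o in objects]
print(triples)  # [('Marie Curie', 'born_in', 'Warsaw')]
```

In practice the tagger is a learned model and the relation label comes from the tag scheme itself rather than being fixed in advance.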
1 code implementation • 16 Aug 2021 • Yaduo Liu, Longhui Zhang, Shujuan Yin, Xiaofeng Zhao, Feiliang Ren
Finally, our system ranks No. 4 on the test-set leaderboard of this multi-format information extraction task, with F1 scores of 79.887%, 85.179%, and 70.828% on the relation extraction, sentence-level event extraction, and document-level event extraction subtasks, respectively.