1 code implementation • 25 Feb 2024 • Fanqi Wan, ZiYi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi
Recently, \textsc{FuseLLM} introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.
no code implementations • 21 Feb 2024 • Xueliang Zhao, Xinting Huang, Tingchen Fu, Qintong Li, Shansan Gong, Lemao Liu, Wei Bi, Lingpeng Kong
Multimodal reasoning stands as a pivotal capability for large vision-language models (LVLMs).
1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi
While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as \emph{hallucination}.
1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi
In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.
1 code implementation • 16 Jan 2024 • Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
We present Inferflow, an efficient and highly configurable inference engine for large language models (LLMs).
no code implementations • 31 Oct 2023 • Xinting Huang, Jiajing Wan, Ioannis Kritikos, Nora Hollenstein
Humans read texts at a varying pace, whereas machine learning models process every token identically in terms of computation.
no code implementations • 19 Oct 2023 • Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong
Large Language Models (LLMs) have driven substantial progress in artificial intelligence in recent years, exhibiting impressive capabilities across a wide range of tasks, including mathematical problem-solving.
1 code implementation • 13 Oct 2023 • Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi
Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.
no code implementations • 11 Sep 2023 • Yongrui Chen, Haiyun Jiang, Xinting Huang, Shuming Shi, Guilin Qi
High-quality instruction-tuning data is critical to improving LLM capabilities.
1 code implementation • 3 Sep 2023 • Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge.
1 code implementation • 15 Jun 2023 • Chenyang Lyu, Minghao Wu, Longyue Wang, Xinting Huang, Bingshuai Liu, Zefeng Du, Shuming Shi, Zhaopeng Tu
Although instruction-tuned large language models (LLMs) have exhibited remarkable capabilities across various NLP tasks, their effectiveness on other data modalities beyond text has not been fully studied.
1 code implementation • 24 May 2023 • Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao
Multi-party dialogues are more difficult for models to understand than two-party dialogues, since they involve multiple interlocutors, resulting in interweaving reply-to relations and information flows.
no code implementations • 22 May 2023 • Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, Wei Bi
Proprietary Large Language Models (LLMs), such as ChatGPT, have garnered significant attention due to their exceptional capabilities in handling a diverse range of tasks.
no code implementations • 3 Aug 2022 • Shuming Shi, Enbo Zhao, Duyu Tang, Yan Wang, Piji Li, Wei Bi, Haiyun Jiang, Guoping Huang, Leyang Cui, Xinting Huang, Cong Zhou, Yong Dai, Dongyang Ma
In Effidit, we significantly expand the capacities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME).
no code implementations • 20 May 2022 • Shiquan Yang, Xinting Huang, Jey Han Lau, Sarah Erfani
Data artifacts incentivize machine learning models to learn non-transferable generalizations by taking advantage of shortcuts in the data, and there is growing evidence that data artifacts play a role in the strong results that deep learning models achieve on recent natural language processing benchmarks.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang
To alleviate the need of action annotations, latent action learning is introduced to map each utterance to a latent representation.
1 code implementation • SEMEVAL 2020 • Jiajing Wan, Xinting Huang
This paper presents our strategies in SemEval 2020 Task 4: Commonsense Validation and Explanation.
no code implementations • ACL 2020 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang
This approach requires complete state-action annotations of human-to-human dialogues (i.e., expert demonstrations), which is labor intensive.
no code implementations • 18 Dec 2019 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang
These two components, however, have a discrepancy in their objectives, i.e., task completion and language quality.
no code implementations • 3 Aug 2019 • Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang, Hai-Tao Zheng
To model and utilize the context information for aggregated search, we propose a model with context attention and representation learning (CARL).