no code implementations • 29 Sep 2024 • Yiwei Li, Jiayi Shi, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
In this work, we introduce a new concept, instruction embedding, and construct the Instruction Embedding Benchmark (IEB) for its training and evaluation.
no code implementations • 26 Aug 2024 • Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Yueqi Zhang, Chuyi Tan, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
With the increase in available context length of LLMs, recent experiments have shown that the performance of ICL does not necessarily scale well in many-shot (demonstration) settings.
no code implementations • 25 Aug 2024 • Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
To mitigate cases where these conditions are not fully met in practice, we further introduce an algorithm that treats humans (when available) and the models under evaluation as reference models, alternately performing model-weight calibration and filtering in the E-step and M-step.
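One plausible reading of this alternating scheme is an EM-style loop that re-weights reference models by their agreement with the current consensus. The sketch below is a minimal illustration under simplifying assumptions (binary judgments, no filtering threshold); `em_weighted_consensus` is a hypothetical helper, not the paper's actual algorithm.

```python
import numpy as np

def em_weighted_consensus(votes, n_iter=10):
    """EM-style re-weighting of reference models (illustrative sketch).

    votes: (n_models, n_items) binary matrix of judgments.
    Returns per-model weights and boolean consensus labels.
    """
    n_models, n_items = votes.shape
    weights = np.ones(n_models) / n_models
    for _ in range(n_iter):
        # E-step: form a weighted consensus label for each item.
        consensus = (weights @ votes) > 0.5
        # M-step: recalibrate each model's weight by its agreement
        # with the consensus (models that agree more count more).
        agreement = (votes == consensus).mean(axis=1)
        weights = agreement / agreement.sum()
    return weights, consensus
```

Models that consistently agree with the weighted majority accumulate weight, while outlier models are down-weighted, which approximates the calibration-and-filtering behavior described above.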
no code implementations • 24 Aug 2024 • Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
To demonstrate the effectiveness of DSC, we conduct extensive experiments on three popular categories of reasoning tasks: arithmetic, commonsense, and symbolic reasoning, across six benchmarks.
no code implementations • 17 Aug 2024 • Xinglin Wang, Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
As Large Language Models (LLMs) have recently shown remarkable abilities across a wide variety of tasks, we are curious about the cognitive levels of current LLMs: to what extent they have developed and how this development has been achieved.
1 code implementation • 2 Jul 2024 • Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, HeDa Wang, Yao Hu, Kan Li
These methods, however, face limitations due to their inability to fully utilize the nuanced consensus knowledge present within multiple candidate samples, often resulting in suboptimal outputs.
1 code implementation • 19 Jan 2024 • Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, HeDa Wang, Xupeng Miao, Kan Li
A memorization-free matching mechanism from Dense Retrieval (DR) is then introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents.
1 code implementation • 19 Jan 2024 • Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, HeDa Wang, Kan Li
Self-consistency (SC) has been a widely used decoding strategy for chain-of-thought reasoning.
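Self-consistency decodes multiple chain-of-thought samples and takes a majority vote over their final answers. A minimal sketch of that voting step, where `sample_answer` is a hypothetical callable standing in for one stochastic generation-plus-answer-extraction pass:

```python
from collections import Counter

def self_consistency(sample_answer, question, n_samples=5):
    """Majority vote over sampled chain-of-thought answers.

    `sample_answer` is assumed to run one stochastic chain-of-thought
    generation for `question` and return the extracted final answer.
    """
    answers = [sample_answer(question) for _ in range(n_samples)]
    # The most frequent final answer across samples wins the vote.
    winner, _ = Counter(answers).most_common(1)[0]
    return winner
```

The vote aggregates over answers rather than reasoning paths, so distinct chains that reach the same result reinforce each other.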
1 code implementation • 31 Dec 2023 • Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, HeDa Wang, Kan Li
Significant progress has been made in automatic text evaluation with the introduction of large language models (LLMs) as evaluators.
1 code implementation • 20 Dec 2023 • Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, HeDa Wang, Kan Li
In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples besides positive ones.
no code implementations • 26 Jul 2021 • Wentian Zhao, Yao Hu, HeDa Wang, Xinxiao Wu, Jiebo Luo
Entity-aware image captioning aims to describe named entities and events related to the image by utilizing the background knowledge in the associated article.