no code implementations • 4 Oct 2024 • Jia Li, Yuqi Zhu, Yongmin Li, Ge Li, Zhi Jin
HonestCoder selectively shows the generated programs to developers based on LLMs' confidence.
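The entry describes a confidence-gated display mechanism. Below is a minimal sketch of that idea under stated assumptions: the generated program is shown only when the model's average token probability clears a threshold. The function names, the threshold value, and the use of mean token probability as the confidence signal are illustrative assumptions, not HonestCoder's actual implementation.

```python
# Sketch (assumed design, not the paper's code): gate whether a generated program
# is shown to the developer by the model's average token probability.
import math
from typing import List, Optional

def should_show(token_logprobs: List[float], threshold: float = 0.8) -> bool:
    """Return True if the mean token probability clears the confidence threshold."""
    if not token_logprobs:
        return False
    mean_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    return mean_prob >= threshold

def selective_display(program: str, token_logprobs: List[float]) -> Optional[str]:
    """Show the generated program only when the model is sufficiently confident."""
    return program if should_show(token_logprobs) else None

# A confident generation is shown; a low-confidence one is withheld (None).
print(selective_display("def add(a, b):\n    return a + b", [-0.05, -0.1, -0.02]))
print(selective_display("def solve():\n    ...", [-1.2, -0.9, -2.3]))
```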
no code implementations • 22 Aug 2024 • Xiaohan Wang, Xiaoyan Yang, Yuqi Zhu, Yue Shen, Jian Wang, Peng Wei, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang
Large Language Models (LLMs) like GPT-4, MedPaLM-2, and Med-Gemini achieve performance competitive with human experts across various medical benchmarks.
1 code implementation • 30 May 2024 • Jia Li, Ge Li, YunFei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li
Our experiments reveal these LLMs' coding abilities in real-world code repositories.
1 code implementation • 23 May 2024 • Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
Imitating the mental world knowledge model of humans, which provides global prior knowledge before a task and maintains local dynamic knowledge during it, in this paper we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning.
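A minimal sketch of the WKM idea, under stated assumptions: a knowledge model supplies task-level (global) guidance once, up front, and state-level (local) knowledge at every step to inform the agent's next action. The classes, prompt formats, and the `policy` callable are hypothetical stand-ins, not the paper's implementation.

```python
# Sketch: an agent loop that consults a world knowledge model for global prior
# knowledge before acting and local dynamic knowledge at each step.
from typing import Callable, List

class WorldKnowledgeModel:
    def global_knowledge(self, task: str) -> str:
        # In the paper this comes from a trained parametric model; here it is a stub.
        return f"General strategy hints for: {task}"

    def local_knowledge(self, state: str) -> str:
        # Dynamic, state-conditioned knowledge maintained during the rollout.
        return f"Relevant constraints for state: {state}"

def run_agent(task: str,
              wkm: WorldKnowledgeModel,
              policy: Callable[[str], str],
              max_steps: int = 3) -> List[str]:
    """Plan with global prior knowledge, refreshing local knowledge every step."""
    prior = wkm.global_knowledge(task)      # queried once, before acting
    state, trajectory = "initial state", []
    for _ in range(max_steps):
        hint = wkm.local_knowledge(state)   # refreshed at every step
        action = policy(f"{task}\n{prior}\n{hint}\nstate={state}")
        trajectory.append(action)
        state = f"after {action}"           # environment transition (stubbed)
    return trajectory

# Usage with a dummy policy standing in for an LLM call.
print(run_agent("put the apple in the fridge", WorldKnowledgeModel(),
                policy=lambda prompt: f"act(context of {len(prompt)} chars)"))
```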
1 code implementation • 5 Mar 2024 • Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Ningyu Zhang, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments by generating executable actions.
no code implementations • 12 Jan 2024 • Jia Li, Ge Li, YunFei Zhao, Yongmin Li, Zhi Jin, Hao Zhu, Huanyu Liu, Kaibo Liu, Lecheng Wang, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yihong Dong, Yuqi Zhu, Bin Gu, Mengfei Yang
Compared to previous benchmarks, DevEval aligns with practical projects along multiple dimensions, e.g., real program distributions, sufficient dependencies, and project contexts of adequate scale.
1 code implementation • 6 Sep 2023 • Yuqi Zhu, Ge Li, YunFei Zhao, Jia Li, Zhi Jin, Hong Mei
With an analysis of loss distributions of code tokens, we find that code tokens can be divided into two categories: challenging tokens that are difficult to predict and confident tokens that can be easily inferred.
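The split described above can be illustrated with a short sketch. Assuming per-token cross-entropy losses are already available (e.g., from a forward pass with unreduced loss), tokens above a threshold are treated as "challenging" and the rest as "confident". The threshold and the example data are illustrative assumptions, not values from the paper.

```python
# Sketch: partition code tokens into challenging vs. confident by per-token loss.
from typing import Dict, List, Tuple

def split_tokens(tokens: List[str],
                 losses: List[float],
                 threshold: float = 2.0) -> Dict[str, List[Tuple[str, float]]]:
    """Group tokens whose loss exceeds the threshold as challenging, the rest as confident."""
    assert len(tokens) == len(losses)
    challenging = [(t, l) for t, l in zip(tokens, losses) if l > threshold]
    confident = [(t, l) for t, l in zip(tokens, losses) if l <= threshold]
    return {"challenging": challenging, "confident": confident}

# Example: identifier names often carry high loss, while syntax tokens are easy.
tokens = ["def", "compute_checksum", "(", "data", ")", ":"]
losses = [0.10, 4.70, 0.05, 3.10, 0.02, 0.01]
print(split_tokens(tokens, losses))
```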
1 code implementation • 22 May 2023 • Yuqi Zhu, Xiaohan Wang, Jing Chen, Shuofei Qiao, Yixin Ou, Yunzhi Yao, Shumin Deng, Huajun Chen, Ningyu Zhang
We conduct experiments across eight diverse datasets, focusing on four representative tasks encompassing entity and relation extraction, event extraction, link prediction, and question answering, thereby thoroughly exploring LLMs' performance in the domain of knowledge graph construction and inference.
2 code implementations • 2 May 2023 • Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
Scaling language models has revolutionized a wide range of NLP tasks, yet few-shot relation extraction with large language models remains little explored.
1 code implementation • 3 Nov 2022 • Haojie Zhang, Ge Li, Jia Li, Zhongjin Zhang, Yuqi Zhu, Zhi Jin
Recently, large-scale pre-trained language models have achieved impressive results on a wide range of downstream tasks.