Search Results for author: Kunlun Zhu

Found 13 papers, 11 papers with code

SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents

1 code implementation | 29 May 2025 | Kunlun Zhu, Jiaxun Zhang, Ziheng Qi, Nuoxing Shang, Zijia Liu, Peixuan Han, Yue Su, Haofei Yu, Jiaxuan You

Recent advancements in large language model (LLM) agents have significantly accelerated scientific discovery automation, yet concurrently raised critical ethical and safety concerns.

Adversarial Attack, Large Language Model, +1

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents

1 code implementation | 3 Mar 2025 | Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Zhe Wang, Zhenhailong Wang, Cheng Qian, Xiangru Tang, Heng Ji, Jiaxuan You

Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents, yet existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition.

ResearchTown: Simulator of Human Research Community

1 code implementation | 23 Dec 2024 | Haofei Yu, Zhaochen Hong, Zirui Cheng, Kunlun Zhu, Keyang Xuan, Jinwei Yao, Tao Feng, Jiaxuan You

Our experiments reveal three key findings: (1) ResearchTown can provide a realistic simulation of collaborative research activities, including paper writing and review writing; (2) ResearchTown can maintain robust simulation with multiple researchers and diverse papers; (3) ResearchTown can generate interdisciplinary research ideas that potentially inspire novel research directions.

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

1 code implementation | 2 Aug 2024 | Kunlun Zhu, Yifan Luo, Dingling Xu, Yukun Yan, Zhenghao Liu, Shi Yu, Ruobing Wang, Shuo Wang, Yishan Li, Nan Zhang, Xu Han, Zhiyuan Liu, Maosong Sun

Evaluating the effectiveness of RAG systems in specialized scenarios remains challenging due to the high costs of data construction and the lack of suitable evaluation metrics.

Benchmarking, Dataset Generation, +6

Scaling Large Language Model-based Multi-Agent Collaboration

1 code implementation | 11 Jun 2024 | Chen Qian, Zihao Xie, Yifei Wang, Wei Liu, Kunlun Zhu, Hanchen Xia, Yufan Dang, Zhuoyun Du, Weize Chen, Cheng Yang, Zhiyuan Liu, Maosong Sun

Recent breakthroughs in large language model-driven autonomous agents have revealed that multi-agent collaboration often surpasses each individual through collective reasoning.

Language Modeling, Language Modelling, +2

How Far Are We From AGI: Are LLMs All We Need?

1 code implementation | 16 May 2024 | Tao Feng, Chuanyang Jin, Jingyu Liu, Kunlun Zhu, Haoqin Tu, Zirui Cheng, GuanYu Lin, Jiaxuan You

The evolution of artificial intelligence (AI) has profoundly impacted human society, driving significant advancements in multiple sectors.

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science

no code implementations | 6 Feb 2024 | Xiangru Tang, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang, Wangchunshu Zhou, Meng Qu, Yilun Zhao, Jian Tang, Zhuosheng Zhang, Arman Cohan, Zhiyong Lu, Mark Gerstein

Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines.

QASnowball: An Iterative Bootstrapping Framework for High-Quality Question-Answering Data Generation

no code implementations | 19 Sep 2023 | Kunlun Zhu, Shihao Liang, Xu Han, Zhi Zheng, Guoyang Zeng, Zhiyuan Liu, Maosong Sun

Recent years have witnessed the success of question answering (QA), especially its potential to be a foundation paradigm for tackling diverse NLP tasks.

Data Augmentation, Question Answering

Exploring Format Consistency for Instruction Tuning

1 code implementation | 28 Jul 2023 | Shihao Liang, Runchu Tian, Kunlun Zhu, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun

Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions.

Denoising, Diversity