Search Results for author: Kejuan Yang

Found 2 papers, 1 papers with code

AgentBench: Evaluating LLMs as Agents

1 code implementation7 Aug 2023 Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

no code implementations24 May 2023 Kejuan Yang, Xiao Liu, Kaiwen Men, Aohan Zeng, Yuxiao Dong, Jie Tang

We identify two crucial limitations in the evaluation of recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context lengths of language models, e. g., 2048 for LLaMA, by harnessing window-wise attention and positional embedding techniques.

Long-Context Understanding

Cannot find the paper you are looking for? You can Submit a new open access paper.