Search Results for author: Jiayi Shi

Found 1 paper, 0 papers with code

Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

No code implementations • 8 Mar 2024 • Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng

The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task.

Tasks: Benchmarking, Decision Making (+2 more)
