Search Results for author: Jiayi Shi

Found 1 paper, 0 papers with code

Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents

no code implementations • 8 Mar 2024 • Jinyang Li, Nan Huo, Yan Gao, Jiayi Shi, Yingxiu Zhao, Ge Qu, Yurong Wu, Chenhao Ma, Jian-Guang Lou, Reynold Cheng

The challenges and costs of collecting realistic interactive logs for data analysis hinder the quantitative evaluation of Large Language Model (LLM) agents in this task.

Benchmarking Decision Making +2

