Search Results for author: Fangyu Lei

Found 16 papers, 10 papers with code

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

no code implementations 12 Nov 2024 Fangyu Lei, Jixuan Chen, Yuxiao Ye, Ruisheng Cao, Dongchan Shin, Hongjin Su, Zhaoqing Suo, Hongcheng Gao, Wenjing Hu, Pengcheng Yin, Victor Zhong, Caiming Xiong, Ruoxi Sun, Qian Liu, Sida Wang, Tao Yu

Real-world enterprise text-to-SQL workflows often involve complex cloud or local data across various database systems, multiple SQL queries in various dialects, and diverse operations from data transformation to analytics.

Code Generation Text-To-SQL

DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models

no code implementations 9 Oct 2024 Yiming Huang, Jianwen Luo, Yan Yu, Yitong Zhang, Fangyu Lei, Yifan Wei, Shizhu He, Lifu Huang, Xiao Liu, Jun Zhao, Kang Liu

We introduce DA-Code, a code generation benchmark specifically designed to assess LLMs on agent-based data science tasks.

Code Generation

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

1 code implementation 15 Jul 2024 Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

These tasks, derived from real-world use cases, evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.

Code Generation

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

1 code implementation 11 Apr 2024 Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity.

Benchmarking

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

1 code implementation 21 Feb 2024 Xiaoyan Yu, Tongxu Luo, Yifan Wei, Fangyu Lei, Yiming Huang, Hao Peng, Liehuang Zhu

Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios.

Incremental Learning

Competition-Level Problems are Effective LLM Evaluators

no code implementations 4 Dec 2023 Yiming Huang, Zhenghao Lin, Xiao Liu, Yeyun Gong, Shuai Lu, Fangyu Lei, Yaobo Liang, Yelong Shen, Chen Lin, Nan Duan, Weizhu Chen

Large language models (LLMs) have demonstrated impressive reasoning capabilities, yet there is ongoing debate about these abilities and the potential data contamination problem recently.

Assessing Knowledge Editing in Language Models via Relation Perspective

2 code implementations 15 Nov 2023 Yifan Wei, Xiaoyan Yu, Huanhuan Ma, Fangyu Lei, Yixuan Weng, Ran Song, Kang Liu

Knowledge Editing (KE) for modifying factual knowledge in Large Language Models (LLMs) has been receiving increasing attention.

knowledge editing Relation

TableQAKit: A Comprehensive and Practical Toolkit for Table-based Question Answering

no code implementations 23 Oct 2023 Fangyu Lei, Tongxu Luo, Pengqi Yang, Weihao Liu, Hanwen Liu, Jiahe Lei, Yiming Huang, Yifan Wei, Shizhu He, Jun Zhao, Kang Liu

Table-based question answering (TableQA) is an important task in natural language processing, which requires comprehending tables and employing various reasoning ways to answer the questions.

Table-based Question Answering

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

2 code implementations 23 Oct 2023 Fangyu Lei, Qian Liu, Yiming Huang, Shizhu He, Jun Zhao, Kang Liu

The rapid development of Large Language Models (LLMs) has led to great strides in model capabilities like long-context understanding and reasoning.

Long-Context Understanding

MenatQA: A New Dataset for Testing the Temporal Comprehension and Reasoning Abilities of Large Language Models

1 code implementation 8 Oct 2023 Yifan Wei, Yisong Su, Huanhuan Ma, Xiaoyan Yu, Fangyu Lei, Yuanzhe Zhang, Jun Zhao, Kang Liu

As a result, it is natural for people to believe that LLMs have also mastered abilities such as time understanding and reasoning.

counterfactual

MMHQA-ICL: Multimodal In-context Learning for Hybrid Question Answering over Text, Tables and Images

no code implementations 9 Sep 2023 Weihao Liu, Fangyu Lei, Tongxu Luo, Jiahe Lei, Shizhu He, Jun Zhao, Kang Liu

Most importantly, we propose a Type-specific In-context Learning Strategy for MMHQA, enabling LLMs to leverage their powerful performance in this task.

In-Context Learning Question Answering +1

S$^3$HQA: A Three-Stage Approach for Multi-hop Text-Table Hybrid Question Answering

1 code implementation 19 May 2023 Fangyu Lei, Xiang Li, Yifan Wei, Shizhu He, Yiming Huang, Jun Zhao, Kang Liu

In this paper, we propose a three-stage TextTableQA framework, S3HQA, which comprises a retriever, a selector, and a reasoner.

Question Answering Reading Comprehension

Multi-View Graph Representation Learning for Answering Hybrid Numerical Reasoning Question

1 code implementation 5 May 2023 Yifan Wei, Fangyu Lei, Yuanzhe Zhang, Jun Zhao, Kang Liu

Hybrid question answering (HybridQA) over the financial report contains both textual and tabular data, and requires the model to select the appropriate evidence for the numerical reasoning task.

Decoder Graph Representation Learning +2

Answering Numerical Reasoning Questions in Table-Text Hybrid Contents with Graph-based Encoder and Tree-based Decoder

1 code implementation COLING 2022 Fangyu Lei, Shizhu He, Xiang Li, Jun Zhao, Kang Liu

In the real-world question answering scenarios, hybrid form combining both tabular and textual contents has attracted more and more attention, among which numerical reasoning problem is one of the most typical and challenging problems.

Decoder Models Alignment +1
