Search Results for author: Yebowen Hu

Found 5 papers, 2 papers with code

Can Large Language Models do Analytical Reasoning?

no code implementations • 6 Mar 2024 • Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

Our analytical reasoning embodies the tasks of letting large language models count how many points each team scores in a quarter in the NBA and NFL games.

Language Modelling, Large Language Model

SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs

no code implementations • 15 Feb 2024 • Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

In this paper, we introduce four novel tasks centered around sports data analytics to evaluate the numerical reasoning and information fusion capabilities of LLMs.

InFoBench: Evaluating Instruction Following Ability in Large Language Models

1 code implementation • 7 Jan 2024 • Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, PengFei Liu, Dong Yu

This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models' (LLMs) ability to follow instructions.

Instruction Following

MeetingBank: A Benchmark Dataset for Meeting Summarization

1 code implementation • 27 May 2023 • Yebowen Hu, Tim Ganter, Hanieh Deilamsalehy, Franck Dernoncourt, Hassan Foroosh, Fei Liu

However, there is a crucial lack of annotated meeting corpora for developing this technology, as it can be hard to collect meetings, especially when the topics discussed are confidential.

Meeting Summarization

DecipherPref: Analyzing Influential Factors in Human Preference Judgments via GPT-4

no code implementations • 24 May 2023 • Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Fei Liu

Human preference judgments are pivotal in guiding large language models (LLMs) to produce outputs that align with human values.

Informativeness
