Search Results for author: Yusen Zhang

Found 23 papers, 17 papers with code

GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers

1 code implementation 12 Dec 2024 Sarkar Snigdha Sarathi Das, Ryo Kamoi, Bo Pang, Yusen Zhang, Caiming Xiong, Rui Zhang

The effectiveness of large language models (LLMs) is closely tied to the design of prompts, making prompt optimization essential for enhancing their performance across a wide range of tasks.

GSM8K Prompt Engineering

Coverage-based Fairness in Multi-document Summarization

no code implementations 11 Dec 2024 Haoyuan Li, Yusen Zhang, Rui Zhang, Snigdha Chaturvedi

In this work, we propose a new summary-level fairness measure, Equal Coverage, which is based on coverage of documents with different social attribute values and considers the redundancy within documents.

Attribute Document Summarization +2

VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information

1 code implementation 1 Dec 2024 Ryo Kamoi, Yusen Zhang, Sarkar Snigdha Sarathi Das, Ranran Haoran Zhang, Rui Zhang

In this work, we introduce VisOnlyQA, a new dataset designed to directly evaluate the visual perception capabilities of LVLMs on questions about geometric and numerical information in scientific figures.

Multiple-choice

Verbosity $\neq$ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models

1 code implementation 12 Nov 2024 Yusen Zhang, Sarkar Snigdha Sarathi Das, Rui Zhang

Both 1) and 2) highlight the urgent need to mitigate the frequency of VC behavior and disentangle verbosity with veracity.

Hallucination

AAAR-1.0: Assessing AI's Potential to Assist Research

no code implementations 29 Oct 2024 Renze Lou, Hanzi Xu, Sijia Wang, Jiangshu Du, Ryo Kamoi, Xiaoxin Lu, Jian Xie, Yuxuan Sun, Yusen Zhang, Jihyun Janice Ahn, Hongchao Fang, Zhuoyang Zou, Wenchao Ma, Xi Li, Kai Zhang, Congying Xia, Lifu Huang, Wenpeng Yin

Numerous studies have assessed the proficiency of AI systems, particularly large language models (LLMs), in facilitating everyday tasks such as email writing, question answering, and creative content generation.

Question Answering

Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models

no code implementations 10 Jun 2024 Xi Li, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang

Large Language Models (LLMs), especially those accessed via APIs, have demonstrated impressive capabilities across various domains.

Prompt Engineering

Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

no code implementations 4 Jun 2024 Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik

Addressing the challenge of effectively processing long contexts has become a critical issue for Large Language Models (LLMs).

Code Completion Question Answering +1

When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs

no code implementations 3 Jun 2024 Ryo Kamoi, Yusen Zhang, Nan Zhang, Jiawei Han, Rui Zhang

Our critical survey based on the newly categorized research questions shows that (1) no prior work demonstrates successful self-correction with feedback from prompted LLMs, except for studies in tasks that are exceptionally suited for self-correction, (2) self-correction works well in tasks that can use reliable external feedback, and (3) large-scale fine-tuning enables self-correction.

Survey

A General Benchmark Framework is Dynamic Graph Neural Network Need

no code implementations 12 Jan 2024 Yusen Zhang

In conclusion, this paper identifies the lack of a standardized benchmark framework as a current limitation in dynamic graph learning research.

Graph Learning Graph Neural Network

Fair Abstractive Summarization of Diverse Perspectives

1 code implementation 14 Nov 2023 Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang

However, current work in summarization metrics and Large Language Models (LLMs) evaluation has not explored fair abstractive summarization.

Abstractive Text Summarization Fairness

FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization

1 code implementation 3 Nov 2023 Nan Zhang, Yusen Zhang, Wu Guo, Prasenjit Mitra, Rui Zhang

In this paper, we investigate and improve faithfulness in summarization on a broad range of medical summarization tasks.

Contrastive Learning

XSemPLR: Cross-Lingual Semantic Parsing in Multiple Natural Languages and Meaning Representations

1 code implementation 7 Jun 2023 Yusen Zhang, Jun Wang, Zhiguo Wang, Rui Zhang

However, existing CLSP models are separately proposed and evaluated on datasets of limited tasks and applications, impeding a comprehensive and unified evaluation of CLSP on a diverse range of NLs and MRs. To this end, we present XSemPLR, a unified benchmark for cross-lingual semantic parsing featured with 22 natural languages and 8 meaning representations by examining and selecting 9 existing datasets to cover 5 tasks and 164 domains.

Cross-Lingual Transfer Decoder +3

MACSum: Controllable Summarization with Mixed Attributes

1 code implementation 9 Nov 2022 Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang

We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.

Attribute Specificity

AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications

1 code implementation COLING 2022 Yusen Zhang, Zhongli Li, Qingyu Zhou, Ziyi Liu, Chao Li, Mina Ma, Yunbo Cao, Hongzhi Liu

To automatically correct handwritten assignments, the traditional approach is to use an OCR model to recognize characters and compare them to answers.

Optical Character Recognition (OCR)

An Exploratory Study on Long Dialogue Summarization: What Works and What's Next

1 code implementation 10 Sep 2021 Yusen Zhang, Ansong Ni, Tao Yu, Rui Zhang, Chenguang Zhu, Budhaditya Deb, Asli Celikyilmaz, Ahmed Hassan Awadallah, Dragomir Radev

Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series.

Retrieval

SummerTime: Text Summarization Toolkit for Non-experts

1 code implementation EMNLP (ACL) 2021 Ansong Ni, Zhangir Azerbayev, Mutethia Mutuma, Troy Feng, Yusen Zhang, Tao Yu, Ahmed Hassan Awadallah, Dragomir Radev

We also provide explanations for models and evaluation metrics to help users understand the model behaviors and select models that best suit their needs.

Document Summarization Multi-Document Summarization

Logic-Consistency Text Generation from Semantic Parses

1 code implementation Findings (ACL) 2021 Chang Shu, Yusen Zhang, Xiangyu Dong, Peng Shi, Tao Yu, Rui Zhang

Text generation from semantic parses is to generate textual descriptions for formal representation inputs such as logic forms and SQL queries.

Text Generation

Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL

1 code implementation 23 Oct 2020 Yusen Zhang, Xiangyu Dong, Shuaichen Chang, Tao Yu, Peng Shi, Rui Zhang

Neural models have achieved significant results on the text-to-SQL task, in which most current work assumes all the input questions are legal and generates a SQL query for any input.

Text-To-SQL
