Search Results for author: Zhanyue Qin

Found 4 papers, 0 papers with code

TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs

no code implementations · 14 Oct 2024 · Haochuan Wang, Xiachong Feng, Lei LI, Zhanyue Qin, Dianbo Sui, Lingpeng Kong

The rapid advancement of large language models (LLMs) has accelerated their application in reasoning, with strategic reasoning drawing increasing attention.

Synthetic Data Generation

Mitigating Gender Bias in Code Large Language Models via Model Editing

no code implementations · 10 Oct 2024 · Zhanyue Qin, Haochuan Wang, Zecheng Wang, Deyuan Liu, Cunhang Fan, Zhao Lv, Zhiying Tu, Dianhui Chu, Dianbo Sui

At the same time, the experimental results show that, considering both the gender bias of the model and its general code generation capability, MG-Editing is most effective when applied at the row and neuron levels of granularity.

Code Generation, knowledge editing, +2 more

UNO Arena for Evaluating Sequential Decision-Making Capability of Large Language Models

no code implementations · 24 Jun 2024 · Zhanyue Qin, Haochuan Wang, Deyuan Liu, Ziyang Song, Cunhang Fan, Zhao Lv, Jinlin Wu, Zhen Lei, Zhiying Tu, Dianhui Chu, Xiaoyan Yu, Dianbo Sui

In order to answer this question, we propose the UNO Arena based on the card game UNO to evaluate the sequential decision-making capability of LLMs and explain in detail why we choose UNO.

Decision Making
