Search Results for author: Hang Xue

Found 4 papers, 1 paper with code

SuperCLUE-Fin: Graded Fine-Grained Analysis of Chinese LLMs on Diverse Financial Tasks and Applications

No code implementations · 29 Apr 2024 · Liang Xu, Lei Zhu, Yaotong Wu, Hang Xue

The SuperCLUE-Fin (SC-Fin) benchmark is a pioneering evaluation framework tailored for Chinese-native financial large language models (FLMs).

Tasks: Computational Efficiency, Logical Reasoning, +1

SuperCLUE-Math6: Graded Multi-Step Math Reasoning Benchmark for LLMs in Chinese

1 code implementation · 22 Jan 2024 · Liang Xu, Hang Xue, Lei Zhu, Kangkang Zhao

We introduce SuperCLUE-Math6 (SC-Math6), a new benchmark dataset for evaluating the mathematical reasoning abilities of Chinese language models.

Tasks: Diversity, GSM8K, +2

SC-Safety: A Multi-round Open-ended Question Adversarial Safety Benchmark for Large Language Models in Chinese

No code implementations · 9 Oct 2023 · Liang Xu, Kangkang Zhao, Lei Zhu, Hang Xue

To systematically assess the safety of Chinese LLMs, we introduce SuperCLUE-Safety (SC-Safety), a multi-round adversarial benchmark with 4,912 open-ended questions covering more than 20 safety sub-dimensions.

Tasks: Model Selection, Natural Language Understanding

SuperCLUE: A Comprehensive Chinese Large Language Model Benchmark

No code implementations · 27 Jul 2023 · Liang Xu, Anqi Li, Lei Zhu, Hang Xue, Changtai Zhu, Kangkang Zhao, Haonan He, Xuanwei Zhang, Qiyue Kang, Zhenzhong Lan

We fill this gap by proposing SuperCLUE, a comprehensive Chinese benchmark named after the popular Chinese LLM benchmark CLUE.

Tasks: Language Modelling, Large Language Model
