no code implementations • 29 Apr 2024 • Liang Xu, Lei Zhu, Yaotong Wu, Hang Xue
The SuperCLUE-Fin (SC-Fin) benchmark is a pioneering evaluation framework tailored for Chinese-native financial large language models (FLMs).
1 code implementation • 22 Jan 2024 • Liang Xu, Hang Xue, Lei Zhu, Kangkang Zhao
We introduce SuperCLUE-Math6 (SC-Math6), a new benchmark dataset for evaluating the mathematical reasoning abilities of Chinese language models.
no code implementations • 9 Oct 2023 • Liang Xu, Kangkang Zhao, Lei Zhu, Hang Xue
To systematically assess the safety of Chinese LLMs, we introduce SuperCLUE-Safety (SC-Safety), a multi-round adversarial benchmark with 4,912 open-ended questions covering more than 20 safety sub-dimensions.
no code implementations • 27 Jul 2023 • Liang Xu, Anqi Li, Lei Zhu, Hang Xue, Changtai Zhu, Kangkang Zhao, Haonan He, Xuanwei Zhang, Qiyue Kang, Zhenzhong Lan
We fill this gap by proposing SuperCLUE, a comprehensive Chinese benchmark named after the popular Chinese LLM benchmark CLUE.