no code implementations • 31 Jul 2024 • Oscar Sainz, Iker García-Ferrero, Alon Jacovi, Jon Ander Campos, Yanai Elazar, Eneko Agirre, Yoav Goldberg, Wei-Lin Chen, Jenny Chim, Leshem Choshen, Luca D'Amico-Wong, Melissa Dell, Run-Ze Fan, Shahriar Golchin, Yucheng Li, PengFei Liu, Bhavish Pahwa, Ameya Prabhu, Suryansh Sharma, Emily Silcock, Kateryna Solonko, David Stap, Mihai Surdeanu, Yu-Min Tseng, Vishaal Udandarao, Zengzhi Wang, Ruijie Xu, Jinglin Yang
The workshop fostered a shared task to collect evidence on data contamination in currently available datasets and models.
1 code implementation • 24 Jun 2024 • Zhen Huang, Zengzhi Wang, Shijie Xia, PengFei Liu
In this report, we pose the following question: Who is the most intelligent AI model to date, as measured by the OlympicArena (an Olympic-level, multi-discipline, multi-modal benchmark for superintelligent AI)?
1 code implementation • 18 Jun 2024 • Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, PengFei Liu
We delve into the models' cognitive reasoning abilities, their performance across different modalities, and their outcomes in process-level evaluations, which are vital for tasks requiring complex reasoning with lengthy solutions.
1 code implementation • 29 Apr 2024 • Ruijie Xu, Zengzhi Wang, Run-Ze Fan, PengFei Liu
By analyzing 31 LLMs in the context of mathematical reasoning, we reveal substantial instances of training-set and even test-set misuse, resulting in potentially unfair comparisons.
1 code implementation • 28 Dec 2023 • Zengzhi Wang, Rui Xia, PengFei Liu
Our meticulous data collection and processing efforts included a complex suite of preprocessing, prefiltering, language identification, cleaning, filtering, and deduplication, ensuring the high quality of our corpus.
2 code implementations • 3 Oct 2023 • Qiming Xie, Zengzhi Wang, Yi Feng, Rui Xia
We observe that current conversational language models often waver in their judgments when faced with follow-up questions, even if the original judgment was correct.
1 code implementation • 29 Jun 2023 • Hongjie Cai, Nan Song, Zengzhi Wang, Qiming Xie, Qiankun Zhao, Ke Li, Siwei Wu, Shijie Liu, Jianfei Yu, Rui Xia
Aspect-based sentiment analysis is a long-standing research interest in the field of opinion mining, and in recent years, researchers have gradually shifted their focus from simple ABSA subtasks to end-to-end multi-element ABSA tasks.
1 code implementation • 10 Apr 2023 • Zengzhi Wang, Qiming Xie, Yi Feng, Zixiang Ding, Zinong Yang, Rui Xia
Recently, ChatGPT has drawn great attention from both the research community and the public.
no code implementations • 20 Nov 2022 • Zengzhi Wang, Rui Xia, Jianfei Yu
Aspect-Based Sentiment Analysis (ABSA) aims to provide fine-grained aspect-level sentiment information.
Ranked #5 on Aspect-Based Sentiment Analysis (ABSA) on ACOS (using extra training data)
Aspect-Based Sentiment Analysis • Aspect-Category-Opinion-Sentiment Quadruple Extraction