no code implementations • 30 Oct 2024 • Jia Li, Ge Li, Xuanming Zhang, YunFei Zhao, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li
These evaluations help practitioners select superior LLMs in specific domains and discover the shortcomings of existing LLMs.
1 code implementation • 9 Oct 2024 • Xuanming Zhang, Yuxuan Chen, Yuan Yuan, Minlie Huang
In real-world software development, improper or missing exception handling can severely impact the robustness and reliability of code.
1 code implementation • 28 Jun 2024 • Xuanming Zhang, Anthony Diaz, Zixun Chen, Qingyang Wu, Kun Qian, Erik Voss, Zhou Yu
To bridge this gap, we introduce DECOR, a novel benchmark that includes expert annotations for detecting incoherence in L2 English writing, identifying the underlying reasons, and rewriting the incoherent sentences.
1 code implementation • 25 Jun 2024 • Kun Qian, Shunji Wan, Claudia Tang, Youzhi Wang, Xuanming Zhang, Maximillian Chen, Zhou Yu
As large language models achieve impressive scores on traditional benchmarks, an increasing number of researchers are becoming concerned about benchmark data leakage during pre-training, commonly known as the data contamination problem.
1 code implementation • 30 May 2024 • Jia Li, Ge Li, YunFei Zhao, Yongmin Li, Huanyu Liu, Hao Zhu, Lecheng Wang, Kaibo Liu, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yuqi Zhu, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li
Our experiments reveal these LLMs' coding abilities in real-world code repositories.
1 code implementation • 31 Mar 2024 • Jia Li, Ge Li, Xuanming Zhang, Yihong Dong, Zhi Jin
Existing benchmarks demonstrate poor alignment with real-world code repositories and are insufficient to evaluate the coding abilities of LLMs.
1 code implementation • 21 Jan 2024 • Xuanming Zhang, Zixun Chen, Zhou Yu
To bridge this gap, we propose a new task, language proficiency-oriented lexical substitution.
no code implementations • 12 Jan 2024 • Jia Li, Ge Li, YunFei Zhao, Yongmin Li, Zhi Jin, Hao Zhu, Huanyu Liu, Kaibo Liu, Lecheng Wang, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yihong Dong, Yuqi Zhu, Bin Gu, Mengfei Yang
Compared to previous benchmarks, DevEval aligns with practical projects in multiple dimensions, e.g., real program distributions, sufficient dependencies, and sufficiently large project contexts.
no code implementations • 3 Sep 2023 • Xuanming Zhang, Xiaoxue Wang, Yonghang Chen
Penalized regression models were then applied for their advantages in overfitting control, high-dimensional data processing, and feature selection, which make them well-suited to the complex energy data.
no code implementations • 12 Apr 2021 • Yihan Pan, Zhenghang Xu, Jin Guang, Jingjing Sun, Chengwenjian Wang, Xuanming Zhang, Xinyun Chen, J. G. Dai, Yichuan Ding, Pengyi Shi, Hongxin Pan, Kai Yang, Song Wu
To address the issue, we propose adding a novel two-level routing component to the queueing network model.
no code implementations • LREC 2020 • Gerardo Ocampo Diaz, Xuanming Zhang, Vincent Ng
We show how the general fine-grained opinion mining concepts of opinion target and opinion expression are related to aspect-based sentiment analysis (ABSA) and discuss their benefits for resource creation over popular ABSA annotation schemes.