no code implementations • 11 Dec 2024 • Zhenpeng Chen, Xinyue Li, Jie M. Zhang, Federica Sarro, Yang Liu
Intersectional fairness is a critical requirement for Machine Learning (ML) software, demanding fairness across subgroups defined by multiple protected attributes.
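The subgroup notion above can be made concrete with a small sketch: partition predictions by every combination of protected attribute values and measure the spread in positive-prediction rates. This is a minimal illustration, not the paper's method; the attribute names, records, and the max-gap metric are all assumptions for the example.

```python
# Hypothetical sketch: intersectional fairness as the gap in positive-
# prediction rates across subgroups defined by COMBINATIONS of protected
# attributes (here sex x race). Data and names are illustrative only.

def subgroup_positive_rates(records, protected_keys):
    """Positive-prediction rate for each combination of protected values."""
    rates = {}
    combos = {tuple(r[k] for k in protected_keys) for r in records}
    for combo in combos:
        group = [r for r in records if tuple(r[k] for k in protected_keys) == combo]
        rates[combo] = sum(r["pred"] for r in group) / len(group)
    return rates

records = [
    {"sex": "F", "race": "A", "pred": 1},
    {"sex": "F", "race": "A", "pred": 0},
    {"sex": "F", "race": "B", "pred": 0},
    {"sex": "M", "race": "A", "pred": 1},
    {"sex": "M", "race": "B", "pred": 1},
    {"sex": "M", "race": "B", "pred": 1},
]
rates = subgroup_positive_rates(records, ["sex", "race"])
gap = max(rates.values()) - min(rates.values())  # larger gap = less fair
```

A model can look fair on each attribute in isolation (e.g. similar rates for F vs. M overall) while a specific intersection, such as (F, B) here, is treated much worse — which is why the subgroups must be formed over attribute combinations.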
no code implementations • 1 Nov 2024 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Yiling Lou, Tianlin Li, Weisong Sun, Yang Liu, Xuanzhe Liu
Our benchmark reveals 72,716 biased responses across the studied LLMs, with individual models yielding between 7,754 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts.
no code implementations • 16 Oct 2024 • Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma
Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development.
no code implementations • 16 Oct 2024 • Gabriele La Malfa, Jie M. Zhang, Michael Luck, Elizabeth Black
Fairness in Multi-Agent Systems (MAS) has been extensively studied, particularly in reward distribution among agents in scenarios such as goods allocation, resource division, lotteries, and bargaining systems.
1 code implementation • 14 Oct 2024 • Dong Huang, Guangtao Zeng, Jianbo Dai, Meng Luo, Han Weng, Yuhao QING, Heming Cui, Zhijiang Guo, Jie M. Zhang
In this work, we present Effi-Code, an approach to enhancing code generation in LLMs that improves both the efficiency and the correctness of the generated code.
1 code implementation • 14 Sep 2024 • Dong Huang, Jie M. Zhang, Mingzhe Du, Mark Harman, Heming Cui
Large language models (LLMs) have been widely applied to assist test generation with the source code under test provided as the context.
3 code implementations • 24 May 2024 • Dong Huang, Jianbo Dai, Han Weng, Puzhen Wu, Yuhao QING, Heming Cui, Zhijiang Guo, Jie M. Zhang
To address this issue, we propose EffiLearner, a self-optimization framework that utilizes execution overhead profiles to improve the efficiency of LLM-generated code.
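The feedback loop described above can be sketched in a few lines: execute a candidate solution, collect an overhead profile, and hand the profile back to the generator for another round. This is a toy illustration under stated assumptions — the profile here is wall time only (the framework also considers memory), and `ask_llm_to_optimize` is a hypothetical stand-in for a real LLM call.

```python
# Minimal sketch in the spirit of a profile-and-refine loop:
# run candidate code, measure overhead, feed the report back to the generator.
import time

def profile(code_str, test_input):
    """Execute a candidate `solve` function and time it (wall clock only)."""
    ns = {}
    exec(code_str, ns)  # define the candidate function in a fresh namespace
    start = time.perf_counter()
    result = ns["solve"](test_input)
    return result, time.perf_counter() - start

def ask_llm_to_optimize(code_str, overhead_report):
    # Hypothetical placeholder: a real system would prompt an LLM with the
    # code and its overhead report. Here we hard-code one known rewrite.
    return code_str.replace("sum(range(n + 1))", "n * (n + 1) // 2")

code = "def solve(n):\n    return sum(range(n + 1))\n"
for _ in range(2):  # a couple of refinement rounds
    result, secs = profile(code, 10_000)
    code = ask_llm_to_optimize(code, f"took {secs:.6f}s")

final, _ = profile(code, 10_000)  # O(n) loop replaced by a closed form
```

The key design point is that the optimizer sees measured overhead, not just the source text, so it can target the parts of the code that actually dominate the cost.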
no code implementations • 16 Apr 2024 • Kaibo Liu, Yiyang Liu, Zhenpeng Chen, Jie M. Zhang, Yudong Han, Yun Ma, Ge Li, Gang Huang
Conventional automated test generation tools struggle to generate test oracles and tricky bug-revealing test inputs.
1 code implementation • 3 Feb 2024 • Dong Huang, Yuhao QING, Weiyi Shang, Heming Cui, Jie M. Zhang
This paper presents EffiBench, a benchmark with 1,000 efficiency-critical coding problems to assess the efficiency of code generated by code generation models.
1 code implementation • 20 Dec 2023 • Dong Huang, Jie M. Zhang, Michael Luck, Qingwen Bu, Yuhao QING, Heming Cui
The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs).
Ranked #2 on Code Generation on MBPP
no code implementations • 25 Oct 2023 • Yonghao Wu, Zheng Li, Jie M. Zhang, Yong Liu
With the growing interest in Large Language Models (LLMs) for fault localization and program repair, ensuring the integrity and generalizability of LLM-based methods becomes paramount.
2 code implementations • 5 Aug 2023 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Ying Zhang, Xuanzhe Liu
This paper conducts fairness testing of automated pedestrian detection, a crucial but under-explored issue in autonomous driving systems.
1 code implementation • 25 Jul 2023 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman
Existing research mostly improves the fairness of Machine Learning (ML) software regarding a single protected attribute at a time, but this is unrealistic given that many users have multiple protected attributes.
no code implementations • 14 Jul 2022 • Max Hort, Zhenpeng Chen, Jie M. Zhang, Mark Harman, Federica Sarro
How many datasets are used for evaluating bias mitigation methods?
2 code implementations • 7 Jul 2022 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman
We find that (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging from 42% to 66% depending on the ML performance metric); (2) they significantly improve fairness, as measured by the 4 metrics used, in 46% of all scenarios (ranging from 24% to 59% depending on the fairness metric); (3) they even decrease both fairness and ML performance in 25% of the scenarios; (4) their effectiveness depends on tasks, models, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; and (5) no bias mitigation method achieves the best trade-off in all scenarios.
1 code implementation • ICLR 2022 • Baptiste Roziere, Jie M. Zhang, Francois Charton, Mark Harman, Gabriel Synnaeve, Guillaume Lample
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.
no code implementations • 19 Jun 2019 • Jie M. Zhang, Mark Harman, Lei Ma, Yang Liu
This paper provides a comprehensive survey of Machine Learning Testing (ML testing) research.
no code implementations • 24 May 2019 • Jie M. Zhang, Mark Harman, Benjamin Guedj, Earl T. Barr, John Shawe-Taylor
MV mutates training data labels, retrains the model against the mutated data, then uses the metamorphic relation that captures the consequent training performance changes to assess model fit.
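The metamorphic relation described above — mutate labels, retrain, and expect training performance to degrade if the model was genuinely fitting the signal — can be illustrated with a toy learner. This sketch is not the paper's setup; the 1-D threshold classifier, the data, and the 25% flip rate are assumptions chosen to keep the example self-contained.

```python
# Hedged sketch of the MV idea: flip a fraction of training labels, retrain,
# and check that training accuracy drops relative to the clean fit.
# The 1-D threshold learner below is illustrative, not the paper's model.

def train_threshold(xs, ys):
    """Pick the threshold (from the training points) with the best accuracy."""
    best_t, best_acc = None, -1.0
    for t in xs:
        acc = sum((x >= t) == y for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

xs = [i / 10 for i in range(20)]
ys = [x >= 1.0 for x in xs]             # clean, threshold-separable labels

_, clean_acc = train_threshold(xs, ys)  # the learner fits this perfectly

mutated = ys[:]
for i in (0, 5, 12, 15, 18):            # flip 25% of the labels
    mutated[i] = not mutated[i]
_, mutated_acc = train_threshold(xs, mutated)
# mutated labels are no longer threshold-representable, so accuracy drops
```

The metamorphic relation sidesteps the need for a ground-truth oracle: no reference model is required, only the expectation that a model fitting real structure must lose training performance once that structure is corrupted.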