Search Results for author: Jie M. Zhang

Found 18 papers, 9 papers with code

Diversity Drives Fairness: Ensemble of Higher Order Mutants for Intersectional Fairness of Machine Learning Software

no code implementations • 11 Dec 2024 • Zhenpeng Chen, Xinyue Li, Jie M. Zhang, Federica Sarro, Yang Liu

Intersectional fairness is a critical requirement for Machine Learning (ML) software, demanding fairness across subgroups defined by multiple protected attributes.

Decision Making Diversity +1
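
To make the requirement concrete, here is a minimal sketch (not the paper's ensemble-of-mutants method) that measures intersectional fairness as the spread of positive-prediction rates across subgroups defined by two protected attributes; the data and attribute names are hypothetical.

```python
# Minimal sketch (not the paper's method): intersectional fairness as the
# max-min gap in positive-prediction rates over subgroups defined by two
# protected attributes. Data and attribute names are hypothetical.
from itertools import product

def intersectional_gap(preds, sex, race):
    """Largest difference in positive rates across sex x race subgroups."""
    rates = []
    for s, r in product(set(sex), set(race)):
        idx = [i for i in range(len(preds)) if sex[i] == s and race[i] == r]
        if idx:  # skip empty intersections
            rates.append(sum(preds[i] for i in idx) / len(idx))
    return max(rates) - min(rates)  # 0.0 means equal rates everywhere

preds = [1, 0, 1, 1, 0, 1, 0, 0]
sex = ["F", "F", "M", "M", "F", "M", "F", "M"]
race = ["A", "B", "A", "B", "A", "B", "A", "B"]
print(intersectional_gap(preds, sex, race))
```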

Benchmarking Bias in Large Language Models during Role-Playing

no code implementations • 1 Nov 2024 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Yiling Lou, Tianlin Li, Weisong Sun, Yang Liu, Xuanzhe Liu

Our benchmark reveals 72,716 biased responses across the studied LLMs, with individual models yielding between 7,754 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts.

Benchmarking Fairness +1

Personality-Guided Code Generation Using Large Language Models

no code implementations • 16 Oct 2024 • Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma

Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development.

Code Generation Personality Alignment

Using Protected Attributes to Consider Fairness in Multi-Agent Systems

no code implementations • 16 Oct 2024 • Gabriele La Malfa, Jie M. Zhang, Michael Luck, Elizabeth Black

Fairness in Multi-Agent Systems (MAS) has been extensively studied, particularly in reward distribution among agents in scenarios such as goods allocation, resource division, lotteries, and bargaining systems.

counterfactual Decision Making +2

Effi-Code: Unleashing Code Efficiency in Language Models

1 code implementation • 14 Oct 2024 • Dong Huang, Guangtao Zeng, Jianbo Dai, Meng Luo, Han Weng, Yuhao QING, Heming Cui, Zhijiang Guo, Jie M. Zhang

In this work, we present Effi-Code, an approach to enhancing code generation in LLMs that can improve both efficiency and correctness.

Code Generation

Rethinking the Influence of Source Code on Test Case Generation

1 code implementation • 14 Sep 2024 • Dong Huang, Jie M. Zhang, Mingzhe Du, Mark Harman, Heming Cui

Large language models (LLMs) have been widely applied to assist test generation with the source code under test provided as the context.

HumanEval
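
As a concrete picture of the setup this paper studies, here is a minimal sketch of prompting an LLM for tests with the source code under test supplied as context; `query_llm` is a hypothetical placeholder rather than a real API.

```python
# Minimal sketch of test generation with the source code under test as
# context. The LLM call is a hypothetical placeholder, not a real API.
def build_test_prompt(source_code: str, focal_function: str) -> str:
    return (
        "Source code under test:\n"
        + source_code
        + f"\n\nWrite pytest unit tests for {focal_function}, "
        "covering normal and edge-case inputs."
    )

source = "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))"
prompt = build_test_prompt(source, "clamp")
# tests = query_llm(prompt)  # hypothetical LLM call
print(prompt)
```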

EffiLearner: Enhancing Efficiency of Generated Code via Self-Optimization

3 code implementations • 24 May 2024 • Dong Huang, Jianbo Dai, Han Weng, Puzhen Wu, Yuhao QING, Heming Cui, Zhijiang Guo, Jie M. Zhang

To address this issue, we propose EffiLearner, a self-optimization framework that utilizes execution overhead profiles to improve the efficiency of LLM-generated code.

Code Generation HumanEval
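
The abstract describes a profile-and-refine loop; a minimal sketch of that idea follows. This is not the authors' implementation, and `refine_with_llm` is a hypothetical placeholder for the model call.

```python
# Minimal sketch of a profile-and-refine loop in the spirit of EffiLearner,
# not the authors' implementation. `refine_with_llm` is hypothetical.
import time
import tracemalloc

def profile(code: str, test: str) -> dict:
    """Run the code plus a test snippet; report runtime and peak memory."""
    env: dict = {}
    tracemalloc.start()
    start = time.perf_counter()
    exec(code + "\n" + test, env)  # run trusted, self-generated code only
    runtime = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"runtime_s": runtime, "peak_bytes": peak}

def self_optimize(code: str, test: str, steps: int = 3) -> str:
    for _ in range(steps):
        overhead = profile(code, test)
        print(overhead)  # the profile fed back to the model each round
        # code = refine_with_llm(code, overhead)  # hypothetical LLM call
    return code

slow = "def total(n):\n    return sum(i * i for i in range(n))"
self_optimize(slow, "assert total(10) == 285")
```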

LLM-Powered Test Case Generation for Detecting Tricky Bugs

no code implementations • 16 Apr 2024 • Kaibo Liu, Yiyang Liu, Zhenpeng Chen, Jie M. Zhang, Yudong Han, Yun Ma, Ge Li, Gang Huang

Conventional automated test generation tools struggle to generate test oracles and tricky bug-revealing test inputs.
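
One standard way around the oracle problem mentioned here is differential testing; the sketch below (not necessarily this paper's technique) flags inputs on which a trusted reference and the program under test disagree.

```python
# Minimal sketch of a differential test oracle: run a trusted reference
# alongside the program under test and report bug-revealing inputs.
def reference_median(xs):           # trusted but slow: full sort
    s = sorted(xs)
    n = len(s)
    return (s[n // 2 - 1] + s[n // 2]) / 2 if n % 2 == 0 else s[n // 2]

def buggy_median(xs):               # program under test: forgets to sort
    n = len(xs)
    return (xs[n // 2 - 1] + xs[n // 2]) / 2 if n % 2 == 0 else xs[n // 2]

def differential_oracle(inputs):
    """Return the inputs on which the two implementations disagree."""
    return [xs for xs in inputs if reference_median(xs) != buggy_median(xs)]

print(differential_oracle([[1, 2, 3], [3, 1, 2], [5, 4, 3, 2, 1]]))
```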

EffiBench: Benchmarking the Efficiency of Automatically Generated Code

1 code implementation • 3 Feb 2024 • Dong Huang, Yuhao QING, Weiyi Shang, Heming Cui, Jie M. Zhang

This paper presents EffiBench, a benchmark with 1,000 efficiency-critical coding problems to assess the efficiency of code generated by code generation models.

Benchmarking Code Completion +1
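
A minimal sketch of how code efficiency can be scored, assuming the common benchmark convention of normalizing a candidate's runtime against a canonical reference solution; the harness and function names are illustrative, not EffiBench's.

```python
# Minimal sketch (not the benchmark's harness): score a generated solution
# by its runtime relative to a canonical reference on the same input.
import timeit

def relative_runtime(candidate, reference, arg, repeats: int = 5) -> float:
    """Ratio > 1.0 means the candidate is slower than the reference."""
    t_cand = min(timeit.repeat(lambda: candidate(arg), number=1, repeat=repeats))
    t_ref = min(timeit.repeat(lambda: reference(arg), number=1, repeat=repeats))
    return t_cand / t_ref

def sum_squares_loop(n):            # candidate: straightforward O(n) loop
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_formula(n):         # reference: closed form, O(1)
    return (n - 1) * n * (2 * n - 1) // 6

print(relative_runtime(sum_squares_loop, sum_squares_formula, 100_000))
```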

AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

1 code implementation • 20 Dec 2023 • Dong Huang, Jie M. Zhang, Michael Luck, Qingwen Bu, Yuhao QING, Heming Cui

The advancement of natural language processing (NLP) has been significantly boosted by the development of transformer-based large language models (LLMs).

Code Generation HumanEval +1

ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair

no code implementations • 25 Oct 2023 • Yonghao Wu, Zheng Li, Jie M. Zhang, Yong Liu

With the growing interest in Large Language Models (LLMs) for fault localization and program repair, ensuring the integrity and generalizability of the LLM-based methods becomes paramount.

Benchmarking Fault localization

Bias Behind the Wheel: Fairness Testing of Autonomous Driving Systems

2 code implementations • 5 Aug 2023 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Ying Zhang, Xuanzhe Liu

This paper conducts fairness testing of automated pedestrian detection, a crucial but under-explored issue in autonomous driving systems.

Autonomous Driving Fairness +1

Fairness Improvement with Multiple Protected Attributes: How Far Are We?

1 code implementation • 25 Jul 2023 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman

Existing research mostly improves the fairness of Machine Learning (ML) software regarding a single protected attribute at a time, but this is unrealistic given that many users have multiple protected attributes.

Attribute Fairness

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

2 code implementations • 7 Jul 2022 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman

We find that: (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (42% to 66%, depending on the ML performance metric); (2) they significantly improve fairness, as measured by the four metrics used, in 46% of the scenarios (24% to 59%, depending on the fairness metric); (3) they even decrease both fairness and ML performance in 25% of the scenarios; (4) their effectiveness depends on the task, the model, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; and (5) no bias mitigation method achieves the best trade-off in all scenarios.

Fairness

Model Validation Using Mutated Training Labels: An Exploratory Study

no code implementations • 24 May 2019 • Jie M. Zhang, Mark Harman, Benjamin Guedj, Earl T. Barr, John Shawe-Taylor

MV (Model Validation) mutates training data labels, retrains the model on the mutated data, then uses the metamorphic relation that captures the consequent training-performance changes to assess model fit.

BIG-bench Machine Learning General Classification +1
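
A minimal sketch of the MV idea as described in the abstract (not the authors' code), using scikit-learn: mutate a fraction of the training labels, retrain, and inspect the drop in training accuracy.

```python
# Minimal sketch of the MV idea from the abstract (not the authors' code):
# mutate a fraction of training labels, retrain, and compare training
# accuracy; a well-fitting model should degrade under label mutation.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic linearly separable task

def train_acc(labels):
    model = LogisticRegression().fit(X, labels)
    return model.score(X, labels)

ratio = 0.2                                   # mutate 20% of labels
flip = rng.choice(len(y), int(ratio * len(y)), replace=False)
y_mut = y.copy()
y_mut[flip] = 1 - y_mut[flip]                 # flip the selected labels

drop = train_acc(y) - train_acc(y_mut)
print(f"training-accuracy drop under mutation: {drop:.3f}")
```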
