Search Results for author: Zhenpeng Chen

Found 22 papers, 12 papers with code

Can Agents Fix Agent Issues?

no code implementations • 27 May 2025 • Alfin Wijaya Rahardja, Junwei Liu, Weitong Chen, Zhenpeng Chen, Yiling Lou

These results underscore the unique challenges of maintaining agent systems compared to traditional software, highlighting the need for further research to develop advanced SE agents for resolving agent issues.

AMQA: An Adversarial Dataset for Benchmarking Bias of LLMs in Medicine and Healthcare

1 code implementation • 26 May 2025 • Ying Xiao, Jie Huang, Ruijuan He, Jing Xiao, Mohammad Reza Mousavi, Yepang Liu, Kezhi Li, Zhenpeng Chen, Jie M. Zhang

Large language models (LLMs) are reaching expert-level accuracy on medical diagnosis questions, yet their mistakes and the biases behind them pose life-critical risks.

Benchmarking Medical Diagnosis +1

Diversity Drives Fairness: Ensemble of Higher Order Mutants for Intersectional Fairness of Machine Learning Software

no code implementations • 11 Dec 2024 • Zhenpeng Chen, Xinyue Li, Jie M. Zhang, Federica Sarro, Yang Liu

Intersectional fairness is a critical requirement for Machine Learning (ML) software, demanding fairness across subgroups defined by multiple protected attributes.

Decision Making Diversity +1

Benchmarking Bias in Large Language Models during Role-Playing

no code implementations • 1 Nov 2024 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Yiling Lou, Tianlin Li, Weisong Sun, Yang Liu, Xuanzhe Liu

Our benchmark reveals 72,716 biased responses across the studied LLMs, with individual models yielding between 7,754 and 16,963 biased responses, underscoring the prevalence of bias in role-playing contexts.

Benchmarking Fairness +1

Personality-Guided Code Generation Using Large Language Models

1 code implementation • 16 Oct 2024 • Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma

Code generation, the automatic creation of source code from natural language descriptions, has garnered significant attention due to its potential to streamline software development.

Code Generation Personality Alignment

Large Language Model-Based Agents for Software Engineering: A Survey

1 code implementation • 4 Sep 2024 • Junwei Liu, Kaixin Wang, Yixuan Chen, Xin Peng, Zhenpeng Chen, Lingming Zhang, Yiling Lou

The recent advance in Large Language Models (LLMs) has shaped a new paradigm of AI agents, i.e., LLM-based agents.

AI Agent Language Modeling +2

LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs

1 code implementation • 16 Apr 2024 • Kaibo Liu, Zhenpeng Chen, Yiyang Liu, Jie M. Zhang, Mark Harman, Yudong Han, Yun Ma, Yihong Dong, Ge Li, Gang Huang

To address this problem, we propose TrickCatcher, an LLM-powered approach to generating test cases for uncovering bugs in plausible programs.

software testing

A Prompt Learning Framework for Source Code Summarization

1 code implementation • 26 Dec 2023 • Tingting Xu, Yun Miao, Chunrong Fang, Hanwei Qian, Xia Feng, Zhenpeng Chen, Chong Wang, Jian Zhang, Weisong Sun, Zhenyu Chen, Yang Liu

Our comprehensive experimental results show that PromptCS significantly outperforms instruction prompting schemes (including zero-shot learning and few-shot learning) on all four widely used metrics, and is comparable to the task-oriented fine-tuning scheme.

Code Summarization Few-Shot Learning +4

Bias Behind the Wheel: Fairness Testing of Autonomous Driving Systems

2 code implementations • 5 Aug 2023 • Xinyue Li, Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Ying Zhang, Xuanzhe Liu

This paper conducts fairness testing of automated pedestrian detection, a crucial but under-explored issue in autonomous driving systems.

Autonomous Driving Fairness +1

Fairness Improvement with Multiple Protected Attributes: How Far Are We?

1 code implementation • 25 Jul 2023 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman

Existing research mostly improves the fairness of Machine Learning (ML) software regarding a single protected attribute at a time, but this is unrealistic given that many users have multiple protected attributes.

Attribute Fairness

A Comprehensive Empirical Study of Bias Mitigation Methods for Machine Learning Classifiers

2 code implementations • 7 Jul 2022 • Zhenpeng Chen, Jie M. Zhang, Federica Sarro, Mark Harman

We find that (1) the bias mitigation methods significantly decrease ML performance in 53% of the studied scenarios (ranging between 42%~66% according to different ML performance metrics); (2) the bias mitigation methods significantly improve fairness measured by the 4 used metrics in 46% of all the scenarios (ranging between 24%~59% according to different fairness metrics); (3) the bias mitigation methods even lead to a decrease in both fairness and ML performance in 25% of the scenarios; (4) the effectiveness of the bias mitigation methods depends on tasks, models, the choice of protected attributes, and the set of metrics used to assess fairness and ML performance; (5) there is no bias mitigation method that can achieve the best trade-off in all the scenarios.

Fairness

Learning point embedding for 3D data processing

no code implementations • 19 Jul 2021 • Zhenpeng Chen, Yuan Li

Among 2D convolutional networks on point clouds, point-based approaches consume point clouds of fixed size directly.

Emojis predict dropouts of remote workers: An empirical study of emoji usage on GitHub

no code implementations • 10 Feb 2021 • Xuan Lu, Wei Ai, Zhenpeng Chen, Yanbin Cao, Qiaozhu Mei

This paper studies how emojis, as non-verbal cues in online communications, can be used for such purposes and how the emotional signals in emoji usage can be used to predict future behavior of workers.

Management

An Empirical Study on Deployment Faults of Deep Learning Based Mobile Applications

1 code implementation • 13 Jan 2021 • Zhenpeng Chen, Huihan Yao, Yiling Lou, Yanbin Cao, Yuanqiang Liu, Haoyu Wang, Xuanzhe Liu

In contrast, faults related to the deployment of DL models on mobile devices (named as deployment faults of mobile DL apps) have not been well studied.

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

no code implementations • 12 Jun 2020 • Chengxu Yang, Qipeng Wang, Mengwei Xu, Zhenpeng Chen, Kaigui Bian, Yunxin Liu, Xuanzhe Liu

Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings.

Fairness Federated Learning +1

Understanding Challenges in Deploying Deep Learning Based Software: An Empirical Study

no code implementations • 2 May 2020 • Zhenpeng Chen, Yanbin Cao, Yuanqiang Liu, Haoyu Wang, Tao Xie, Xuanzhe Liu

Deep learning (DL) becomes increasingly pervasive, being used in a wide range of software applications.

Software Engineering

SEntiMoji: An Emoji-Powered Learning Approach for Sentiment Analysis in Software Engineering

1 code implementation • 4 Jul 2019 • Zhenpeng Chen, Yanbin Cao, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu

However, commonly used out-of-the-box sentiment analysis tools cannot obtain reliable results on SE tasks and the misunderstanding of technical jargon is demonstrated to be the main reason.

Representation Learning Sentiment Analysis

A First Look at Emoji Usage on GitHub: An Empirical Study

1 code implementation • 12 Dec 2018 • Xuan Lu, Yanbin Cao, Zhenpeng Chen, Xuanzhe Liu

We find that emojis are used by a considerable proportion of GitHub users.

Computers and Society Software Engineering

Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification

1 code implementation • 7 Jun 2018 • Zhenpeng Chen, Sheng Shen, Ziniu Hu, Xuan Lu, Qiaozhu Mei, Xuanzhe Liu

To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually English) to another language with fewer labels (i.e., the target language).

Classification Cross-Lingual Sentiment Classification +5
