Search Results for author: David Lo

Found 47 papers, 20 papers with code

Watch out for This Commit! A Study of Influential Software Changes

no code implementations • 10 Jun 2016 • Daoyuan Li, Li Li, Dongsun Kim, Tegawendé F. Bissyandé, David Lo, Yves Le Traon

One single code change can significantly influence a wide range of software systems and their users.

Software Engineering

Collective Semi-Supervised Learning for User Profiling in Social Media

no code implementations • 24 Jun 2016 • Richard J. Oentaryo, Ee-Peng Lim, Freddy Chong Tat Chua, Jia-Wei Low, David Lo

The abundance of user-generated data in social media has incentivized the development of methods to infer the latent attributes of users, which are crucially useful for personalization, advertising and recommendation.

WebAPIRec: Recommending Web APIs to Software Projects via Personalized Ranking

no code implementations • 1 May 2017 • Ferdian Thung, Richard J. Oentaryo, David Lo, Yuan Tian

In this light, we propose a new, automated approach called WebAPIRec that takes as input a project profile and outputs a ranked list of web APIs that can be used to implement the project.

Network-Clustered Multi-Modal Bug Localization

no code implementations • 27 Feb 2018 • Thong Hoang, Richard J. Oentaryo, Tien-Duy B. Le, David Lo

To help the developers debug, numerous information retrieval (IR)-based and spectrum-based bug localization techniques have been devised.

Clustering Information Retrieval +1

PatchNet: A Tool for Deep Patch Classification

1 code implementation • 16 Feb 2019 • Thong Hoang, Julia Lawall, Richard J. Oentaryo, Yuan Tian, David Lo

This work proposes PatchNet, an automated tool based on hierarchical deep learning for classifying patches by extracting features from commit messages and code changes.

Classification General Classification

Question Relatedness on Stack Overflow: The Task, Dataset, and Corpus-inspired Models

no code implementations • 3 May 2019 • Amirreza Shirani, Bowen Xu, David Lo, Thamar Solorio, Amin Alipour

The proposed dataset Stack Overflow is a useful resource to develop novel solutions, specifically data-hungry neural network models, for the prediction of relatedness in technical community question-answering forums.

Community Question Answering Multi-class Classification

SmartEmbed: A Tool for Clone and Bug Detection in Smart Contracts through Structural Code Embedding

1 code implementation • 22 Aug 2019 • Zhipeng Gao, Vinoj Jayasundara, Lingxiao Jiang, Xin Xia, David Lo, John Grundy

In addition to its use by individual developers, SmartEmbed can also be applied to studies of smart contracts on a large scale.

Software Engineering

Automatic Generation of Pull Request Descriptions

1 code implementation • 16 Sep 2019 • Zhongxin Liu, Xin Xia, Christoph Treude, David Lo, Shanping Li

We build a dataset with over 41K PRs and evaluate our approach on this dataset through ROUGE and a human evaluation.

Software Engineering

TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing

no code implementations • 27 Oct 2019 • Vinoj Jayasundara, Nghi Duy Quoc Bui, Lingxiao Jiang, David Lo

Program comprehension is a fundamental task in software development and maintenance processes.

Smart Contract Repair

1 code implementation • 12 Dec 2019 • Xiao Liang Yu, Omar Al-Bataineh, David Lo, Abhik Roychoudhury

Our approach can be used to optimise the overall security and reliability of smart contracts against malicious attackers.

Software Engineering Cryptography and Security 68N15 D.1.2

Checking Smart Contracts with Structural Code Embedding

1 code implementation • 20 Jan 2020 • Zhipeng Gao, Lingxiao Jiang, Xin Xia, David Lo, John Grundy

However, many bugs and vulnerabilities have been identified in many contracts, which raises serious concerns about smart contract security, not to mention that the blockchain systems on which the smart contracts are built can be buggy.

Software Engineering

Automating App Review Response Generation

1 code implementation • 10 Feb 2020 • Cuiyun Gao, Jichuan Zeng, Xin Xia, David Lo, Michael R. Lyu, Irwin King

Previous studies showed that replying to a user review usually has a positive effect on the rating that is given by the user to the app.

Response Generation

Keen2Act: Activity Recommendation in Online Social Collaborative Platforms

no code implementations • 11 May 2020 • Roy Ka-Wei Lee, Thong Hoang, Richard J. Oentaryo, David Lo

The Act step then recommends to the user which activities to perform on the identified set of items.

Recommendation Systems

Generating Question Titles for Stack Overflow from Mined Code Snippets

1 code implementation • 20 May 2020 • Zhipeng Gao, Xin Xia, John Grundy, David Lo, Yuan-Fang Li

Stack Overflow has been heavily used by software developers as a popular way to seek programming-related information from peers via the internet.

Software Engineering

CodeMatcher: Searching Code Based on Sequential Semantics of Important Query Words

no code implementations • 29 May 2020 • Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li

CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with important query words on the codebase that is indexed by the Elasticsearch tool, and finally reranks the set of returned candidate code snippets according to how well the tokens in each candidate sequentially match the important words in the query.

Code Search Information Retrieval +1

On the Replicability and Reproducibility of Deep Learning in Software Engineering

no code implementations • 25 Jun 2020 • Chao Liu, Cuiyun Gao, Xin Xia, David Lo, John Grundy, Xiaohu Yang

Experimental results show the importance of replicability and reproducibility, where the reported performance of a DL model could not be replicated for an unstable optimization process.

Feature Engineering

Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing

no code implementations • 23 Aug 2020 • Cuiyun Gao, Jichuan Zeng, Zhiyuan Wen, David Lo, Xin Xia, Irwin King, Michael R. Lyu

Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT in identifying emerging app issues, improving the state-of-the-art method by 22.3% in terms of F1-score.

Clustering

What Makes a Popular Academic AI Repository?

1 code implementation • 6 Oct 2020 • Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li

Hence, in this study, we perform an empirical study on academic AI repositories to highlight good software engineering practices of popular academic AI repositories for AI researchers.

Software Engineering

AndroEvolve: Automated Update for Android Deprecated-API Usages

1 code implementation • 14 Dec 2020 • Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS.

Software Engineering

Smart Contract Security: a Practitioners' Perspective

no code implementations • 22 Feb 2021 • Zhiyuan Wan, Xin Xia, David Lo, Jiachi Chen, Xiapu Luo, Xiaohu Yang

Given numerous research efforts in addressing the security issues of smart contracts, we wondered how software practitioners build security into smart contracts in practice.

Software Engineering

FACOS: Finding API Relevant Contents on Stack Overflow with Semantic and Syntactic Analysis

no code implementations • 14 Nov 2021 • Kien Luong, Mohammad Hadi, Ferdian Thung, Fatemeh Fard, David Lo

Leveraging this observation, we develop FACOS, a context-specific algorithm to capture the semantic and syntactic information of the paragraphs and code snippets in a discussion.

Code Smells in Machine Learning Systems

no code implementations • 2 Mar 2022 • Jiri Gesi, SiQi Liu, Jiawei Li, Iftekhar Ahmed, Nachiappan Nagappan, David Lo, Eduardo Santana de Almeida, Pavneet Singh Kochhar, Lingfeng Bao

We found that our newly identified code smells are prevalent and impactful on the maintenance of DL systems from the developer's perspective.

BIG-bench Machine Learning

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

no code implementations • 5 Apr 2022 • Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin

Furthermore, some programming languages are inherently different and code written in one language usually cannot be interchanged with the others, i.e., Ruby and Java code possess very different structures.

Code Search Code Summarization

On the Effectiveness of Pretrained Models for API Learning

no code implementations • 5 Apr 2022 • Mohammad Abdul Hadi, Imam Nur Bani Yusuf, Ferdian Thung, Kien Gia Luong, Jiang Lingxiao, Fatemeh H. Fard, David Lo

We have also identified two different tokenization approaches that can contribute to a significant boost in PTMs' performance for the API sequence generation task.

Information Retrieval Language Modelling +2

An Exploratory Study on Code Attention in BERT

no code implementations • 5 Apr 2022 • Rishab Sharma, Fuxiang Chen, Fatemeh Fard, David Lo

When identifiers' embeddings are used in CodeBERT, a code-based PLM, the performance is improved by 21-24% in the F1-score of clone detection.

Clone Detection Code Summarization

How to Find Actionable Static Analysis Warnings: A Case Study with FindBugs

1 code implementation • 21 May 2022 • Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies

Automatically generated static code warnings suffer from a large number of false alarms.

VulCurator: A Vulnerability-Fixing Commit Detector

1 code implementation • 7 Sep 2022 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Xuan-Bach D. Le, David Lo

Open-source software (OSS) vulnerability management process is important nowadays, as the number of discovered OSS vulnerabilities is increasing over time.

Management

AutoPruner: Transformer-Based Call Graph Pruning

1 code implementation • 7 Sep 2022 • Thanh Le-Cong, Hong Jin Kang, Truong Giang Nguyen, Stefanus Agus Haryono, David Lo, Xuan-Bach D. Le, Huynh Quyet Thang

Given a call graph constructed by traditional static analysis tools, AutoPruner takes a Transformer-based approach to capture the semantic relationships between the caller and callee functions associated with each edge in the call graph.

BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets

1 code implementation • 7 Oct 2022 • Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang

Our experiments conducted on four tasks and four offline RL algorithms expose a disquieting fact: none of the existing offline RL algorithms is immune to such a backdoor attack.

Autonomous Driving Backdoor Attack +3

Invalidator: Automated Patch Correctness Assessment via Semantic and Syntactic Reasoning

1 code implementation • 3 Jan 2023 • Thanh Le-Cong, Duc-Minh Luong, Xuan Bach D. Le, David Lo, Nhat-Hoa Tran, Bui Quang-Huy, Quyet-Thang Huynh

In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax.

Language Modelling Program Repair

ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems

1 code implementation • 11 Feb 2023 • Daniel Hao Xian Yuen, Andrew Yong Chen Pang, Zhou Yang, Chun Yong Chong, Mei Kuan Lim, David Lo

To address these limitations, our tool incorporates two novel features: (1) a text transformation module to boost the number of generated test cases and uncover more errors in ASR systems and (2) a phonetic analysis module to identify the phonemes on which the ASR system tends to produce errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Regret-Based Defense in Adversarial Reinforcement Learning

no code implementations • 14 Feb 2023 • Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo

We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.

reinforcement-learning Reinforcement Learning (RL)

Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking

no code implementations • 16 Feb 2023 • Zichong Wang, Yang Zhou, Meikang Qiu, Israat Haque, Laura Brown, Yi He, Jianwu Wang, David Lo, Wenbin Zhang

The increasing use of Machine Learning (ML) software can lead to unfair and unethical decisions, thus fairness bugs in software are becoming a growing concern.

Benchmarking counterfactual +1

A Study of Variable-Role-based Feature Enrichment in Neural Models of Code

no code implementations • 8 Mar 2023 • Aftab Hussain, Md Rafiqul Islam Rabin, Bowen Xu, David Lo, Mohammad Amin Alipour

In this paper, we explore the impact of an unsupervised feature enrichment approach based on variable roles on the performance of neural models of code.

Feature Engineering

On the Usage of Continual Learning for Out-of-Distribution Generalization in Pre-trained Language Models of Code

no code implementations • 6 May 2023 • Martin Weyssow, Xin Zhou, Kisub Kim, David Lo, Houari Sahraoui

We demonstrate that the most commonly used fine-tuning technique from prior work is not robust enough to handle the dynamic nature of APIs, leading to the loss of previously acquired knowledge, i.e., catastrophic forgetting.

Continual Learning General Knowledge +1

Multi-Granularity Detector for Vulnerability Fixes

1 code implementation • 23 May 2023 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, Xuan-Bach D. Le, David Lo

To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes).

Source Code Data Augmentation for Deep Learning: A Survey

1 code implementation • 31 May 2023 • Terry Yue Zhuo, Zhou Yang, Zhensu Sun, YuFei Wang, Li Li, Xiaoning Du, Zhenchang Xing, David Lo

This paper fills this gap by conducting a comprehensive and integrative survey of data augmentation for source code, wherein we systematically compile and encapsulate existing literature to provide a comprehensive overview of the field.

Data Augmentation

Large Language Models for Software Engineering: A Systematic Literature Review

1 code implementation • 21 Aug 2023 • Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, Haoyu Wang

Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages.

Trustworthy and Synergistic Artificial Intelligence for Software Engineering: Vision and Roadmaps

no code implementations • 8 Sep 2023 • David Lo

For decades, much software engineering research has been dedicated to devising automated solutions aimed at enhancing developer productivity and elevating software quality.

Inferring Properties of Graph Neural Networks

no code implementations • 8 Jan 2024 • Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu

To tackle the challenge of varying input structures in GNNs, GNNInfer first identifies a set of representative influential structures that contribute significantly towards the prediction of a GNN.

Backdoor Attack

A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research

no code implementations • 26 Jan 2024 • Sicong Cao, Xiaobing Sun, Ratnadira Widyasari, David Lo, Xiaoxue Wu, Lili Bo, Jiale Zhang, Bin Li, Wei Liu, Di wu, Yixin Chen

The remarkable achievements of Artificial Intelligence (AI) algorithms, particularly in Machine Learning (ML) and Deep Learning (DL), have fueled their extensive deployment across multiple sectors, including Software Engineering (SE).

Decision Making Vulnerability Detection

Bridging Expert Knowledge with Deep Learning Techniques for Just-In-Time Defect Prediction

no code implementations • 17 Mar 2024 • Xin Zhou, DongGyun Han, David Lo

In addition, our experimental results confirm that the simple model and complex model are complementary to each other.
