no code implementations • 16 Apr 2024 • Ahmed E. Hassan, Gustavo A. Oliva, Dayi Lin, Boyuan Chen, Zhen Ming, Jiang
We envision AI pair programmers that are goal-driven, human partners, SE-aware, and self-learning.
1 code implementation • 27 Mar 2024 • Harsh Patel, Dominique Boucher, Emad Fallahzadeh, Ahmed E. Hassan, Bram Adams
This paper investigates the complexities of integrating Large Language Models (LLMs) into software products, with a focus on the challenges encountered for determining their readiness for release.
1 code implementation • 25 Feb 2024 • Zhimin Zhao, Yihao Chen, Abdul Ali Bangash, Bram Adams, Ahmed E. Hassan
In machine learning (ML), efficient asset management, including ML models, datasets, algorithms, and tools, is vital for resource optimization, consistent performance, and a streamlined development lifecycle.
no code implementations • 25 Feb 2024 • Ahmed E. Hassan, Dayi Lin, Gopi Krishnan Rajbahadur, Keheliya Gallaba, Filipe R. Cogo, Boyuan Chen, Haoxiang Zhang, Kishanthan Thangarajah, Gustavo Ansaldi Oliva, Jiahuei Lin, Wali Mohammad Abdullah, Zhen Ming Jiang
Finally, we (i) show how the unique properties of FMArts enabled us to design and develop a complex FMware for a large customer in a timely manner and (ii) discuss the lessons that we learned in doing so.
1 code implementation • 19 Dec 2023 • Moses Openja, Foutse khomh, Armstrong Foundjem, Zhen Ming, Jiang, Mouna Abidi, Ahmed E. Hassan
Aims: To fill this gap, we perform the first fine-grained empirical study on ML testing practices in the wild, to identify the ML properties being tested, the followed testing strategies, and their implementation throughout the ML workflow.
1 code implementation • 23 May 2023 • Truong Giang Nguyen, Thanh Le-Cong, Hong Jin Kang, Ratnadira Widyasari, Chengran Yang, Zhipeng Zhao, Bowen Xu, Jiayuan Zhou, Xin Xia, Ahmed E. Hassan, Xuan-Bach D. Le, David Lo
To address these challenges and boost the effectiveness of prior works, we propose MiDas (Multi-Granularity Detector for Vulnerability Fixes).
no code implementations • 28 Nov 2022 • Mohamed Raed El aoun, Lionel Nganyewou Tidjon, Ben Rombaut, Foutse khomh, Ahmed E. Hassan
In this paper, we present a qualitative and quantitative analysis of the most frequent dl libraries combination, the distribution of dl library dependencies across the ml workflow, and formulate a set of recommendations to (i) hardware builders for more optimized accelerators and (ii) library builder for more refined future releases.
no code implementations • 12 Feb 2022 • Gopi Krishnan Rajbahadur, Shaowei Wang, Yasutaka Kamei, Ahmed E. Hassan
We find that: i) Random forest based classifiers outperform other classifiers (best AUC) for both classifier building approaches; ii) In contrast to common practice, building a defect classifier using discretized defect counts (i. e., discretized classifiers) does not always lead to better performance.
no code implementations • 12 Feb 2022 • Gopi Krishnan Rajbahadur, Shaowei Wang, Yasutaka Kamei, Ahmed E. Hassan
Researchers usually discretize a continuous dependent variable into two target classes by introducing an artificial discretization threshold (e. g., median).
no code implementations • 4 Feb 2022 • Gopi Krishnan Rajbahadur, Shaowei Wang, Yasutaka Kamei, Ahmed E. Hassan
We further observe that the commonly used defect datasets are rife with feature interactions and these feature interactions impact the computed feature importance ranks of the CS methods (not the CA methods).
1 code implementation • 6 Oct 2020 • Yuanrui Fan, Xin Xia, David Lo, Ahmed E. Hassan, Shanping Li
Hence, in this study, we perform an empirical study on academic AI repositories to highlight good software engineering practices of popular academic AI repositories for AI researchers.
Software Engineering
no code implementations • 29 May 2020 • Chao Liu, Xin Xia, David Lo, Zhiwei Liu, Ahmed E. Hassan, Shanping Li
CodeMatcher first collects metadata for query words to identify irrelevant/noisy ones, then iteratively performs fuzzy search with important query words on the codebase that is indexed by the Elasticsearch tool, and finally reranks a set of returned candidate code according to how the tokens in the candidate code snippet sequentially matched the important words in a query.