no code implementations • 3 Jan 2024 • Md Rayhanur Rahman, Brandon Wroblewski, Quinn Matthews, Brantley Morgan, Tim Menzies, Laurie Williams
The goal of this paper is to aid security practitioners in prioritizing and proactively defending against cyberattacks by mining temporal attack patterns from cyberthreat intelligence reports.
1 code implementation • 3 Feb 2023 • Huy Tu, Tim Menzies
Standard SSL algorithms use "weak" knowledge (i.e., knowledge not based on specific SE knowledge), e.g., co-training two learners and using good labels from one to train the other.
no code implementations • 25 Jan 2023 • Lauren Alvarez, Tim Menzies
STEALTH is a method for using some AI-generated model without suffering from malicious attacks (i.e., lying) or associated unfairness issues.
1 code implementation • 16 Jan 2023 • Andre Lustosa, Tim Menzies
For example, for project health indicators such as C = number of commits, I = number of closed issues, and R = number of closed pull requests, niSNEAK's 12-month prediction errors are {I=0%, R=33%, C=47%}. Based on the above, we recommend landscape analytics (e.g., niSNEAK), especially when learning from very small data sets.
no code implementations • 18 Nov 2022 • Guanqin Zhang, Jiankun Sun, Feng Xu, H. M. N. Dilum Bandara, Shiping Chen, Yulei Sui, Tim Menzies
Deep neural networks (DNNs) are widely used in many industries such as image recognition, supply chain, medical diagnosis, and autonomous driving.
2 code implementations • 10 Nov 2022 • Suvodeep Majumder, Joymallya Chakraborty, Tim Menzies
Hence, there are often limits on how much labeled data is available for training.
1 code implementation • 21 May 2022 • Rahul Yedida, Hong Jin Kang, Huy Tu, Xueqi Yang, David Lo, Tim Menzies
Automatically generated static code warnings suffer from a large number of false alarms.
no code implementations • 22 Mar 2022 • Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies
Conclusion: Based on this study, we suggest using optimized GANs as an alternative method for addressing class imbalance in security vulnerability data.
no code implementations • 25 Jan 2022 • Huy Tu, Tim Menzies
The human experts are then required to read almost a quintuple of the SATD comments which indicates the inefficiency of the tool.
1 code implementation • 25 Oct 2021 • Suvodeep Majumder, Joymallya Chakraborty, Gina R. Bai, Kathryn T. Stolee, Tim Menzies
In summary, to simplify the fairness testing problem, we recommend the following steps: (1) determine what type of fairness is desirable (and we offer a handful of such types); then (2) look up those types in our clusters; then (3) just test for one item per cluster.
no code implementations • 3 Oct 2021 • Kewen Peng, Joymallya Chakraborty, Tim Menzies
Our approach aims to offset the biased predictions of the classification model via rebalancing the distribution of protected attributes.
no code implementations • 29 Sep 2021 • Rahul Yedida, Rahul Krishna, Anup Kalia, Tim Menzies, Jin Xiao, Maja Vukovic
When services are divided into many independent components, they are easier to update.
1 code implementation • 22 Aug 2021 • Huy Tu, Tim Menzies
However, prior work has shown that such requirements can be expensive, taking several weeks to label thousands of commits, and not always available when traversing new research problems and domains.
1 code implementation • 17 Jul 2021 • Zhe Yu, Joymallya Chakraborty, Tim Menzies
We found that equalizing the class distribution in each demographic group with sample weights is a necessary condition for achieving equalized odds without modifying the normal training process.
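The weighting idea in the sentence above can be illustrated with a small sketch. It assumes one plausible scheme (not necessarily the one used in the paper): each sample in demographic group g with class label y gets weight N_g / (K_g * N_{g,y}), where N_g is the group size, K_g is the number of classes seen in that group, and N_{g,y} is the size of the (group, class) cell. The function name and the toy data are hypothetical.

```python
from collections import Counter


def group_class_balancing_weights(groups, labels):
    """Return one weight per sample so that, inside each demographic group,
    every class carries the same total weight (illustrative scheme only)."""
    group_counts = Counter(groups)             # N_g
    cell_counts = Counter(zip(groups, labels))  # N_{g,y}
    # K_g: number of distinct classes observed inside each group
    classes_per_group = {
        g: len({y for gg, y in cell_counts if gg == g}) for g in group_counts
    }
    return [
        group_counts[g] / (classes_per_group[g] * cell_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


# Toy data: group "a" is skewed toward class 1, group "b" is already balanced.
groups = ["a", "a", "a", "a", "b", "b"]
labels = [1, 1, 1, 0, 1, 0]
weights = group_class_balancing_weights(groups, labels)
```

Weights of this kind could then be passed to a learner that accepts per-sample weights, leaving the training procedure itself unchanged.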
no code implementations • 11 Jul 2021 • Tim Menzies, Kewen Peng, Andre Lustosa
Can we simplify explanations for software analytics?
2 code implementations • 25 May 2021 • Joymallya Chakraborty, Suvodeep Majumder, Tim Menzies
This paper postulates that the root causes of bias are the prior decisions that affect (a) what data was selected and (b) the labels assigned to those examples.
2 code implementations • 24 May 2021 • N. C. Shrikanth, Tim Menzies
Moreover, using this early bird method, we have shown that a simple model (with just a few features) generalizes to hundreds of projects.
1 code implementation • 15 Jan 2021 • Rahul Yedida, Xueqi Yang, Tim Menzies
We test the hypothesis laid out by Galke and Scherp [18], that feedforward networks suffice for many analytics tasks (which we call the "Old but Gold" hypothesis), for these two tasks.
1 code implementation • 26 Nov 2020 • N. C. Shrikanth, Suvodeep Majumder, Tim Menzies
Hence, defect predictors learned from the first 150 commits and four months perform just as well as anything else.
no code implementations • 23 Nov 2020 • Rui Shu, Tianpei Xia, Laurie Williams, Tim Menzies
Conclusion: When employing ensemble defense against adversarial evasion attacks, we suggest creating an ensemble with unexpected models that are distant from the attacker's expected model (i.e., target model) through methods such as hyperparameter optimization.
1 code implementation • 7 Oct 2020 • Paul Ralph, Nauman bin Ali, Sebastian Baltes, Domenico Bianculli, Jessica Diaz, Yvonne Dittrich, Neil Ernst, Michael Felderer, Robert Feldt, Antonio Filieri, Breno Bernard Nicolau de França, Carlo Alberto Furia, Greg Gay, Nicolas Gold, Daniel Graziotin, Pinjia He, Rashina Hoda, Natalia Juristo, Barbara Kitchenham, Valentina Lenarduzzi, Jorge Martínez, Jorge Melegati, Daniel Mendez, Tim Menzies, Jefferson Molleri, Dietmar Pfahl, Romain Robbes, Daniel Russo, Nyyti Saarimäki, Federica Sarro, Janet Siegmund, Diomidis Spinellis, Miroslaw Staron, Klaas Stol, Margaret-Anne Storey, Davide Taibi, Damian Tamburri, Marco Torchiano, Christoph Treude, Burak Turhan, XiaoFeng Wang, Sira Vegas
Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g., a questionnaire survey).
Software Engineering, General Literature
no code implementations • 21 Aug 2020 • Suvodeep Majumder, Pranav Mody, Tim Menzies
We find that some analytics in-the-small conclusions still hold when scaling up to analytics in-the-large.
no code implementations • 3 Aug 2020 • Xiao Ling, Rishabh Agrawal, Tim Menzies
Improved test case prioritization means that software developers can detect and fix more software faults sooner than usual.
Software Engineering
2 code implementations • 25 Feb 2020 • Zhe Yu, Fahmid Morshed Fahid, Huy Tu, Tim Menzies
Keeping track of and managing the self-admitted technical debts (SATDs) is important to maintaining a healthy software project.
Software Engineering
no code implementations • 9 Dec 2019 • Amritanshu Agrawal, Xueqi Yang, Rishabh Agrawal, Rahul Yedida, Xipeng Shen, Tim Menzies
How can we make software analytics simpler and faster?
1 code implementation • 6 Nov 2019 • Suvodeep Majumder, Tianpei Xia, Rahul Krishna, Tim Menzies
To the best of our knowledge, STABILIZER is an order of magnitude faster than the prior state-of-the-art transfer learners that seek to find conclusion stability, and these case studies are the largest demonstration of the generalizability of quantitative predictions of project quality yet reported in the SE literature.
no code implementations • 4 Nov 2019 • Rui Shu, Tianpei Xia, Jianfeng Chen, Laurie Williams, Tim Menzies
For example, in a study of security bug reports from the Chromium dataset, the median recalls of FARSEC and Swift were 15.7% and 77.4%, respectively.
Software Engineering
2 code implementations • 1 Nov 2019 • Rahul Krishna, Vivek Nair, Pooyan Jamshidi, Tim Menzies
To resolve these problems, we propose a novel transfer learning framework called BEETLE, which is a "bellwether"-based transfer learner that focuses on identifying and learning from the most relevant source from amongst the old data.
Software Engineering
1 code implementation • 20 May 2019 • Fahmid M. Fahid, Zhe Yu, Tim Menzies
Specifically, for ten open-source Java projects, we can find 83% of the technical debt via SURVEY0 using just 16% of the comments (and if higher levels of recall are required, SURVEY0 can adjust towards that with some additional effort).
2 code implementations • 16 May 2019 • Zhe Yu, Fahmid M. Fahid, Tim Menzies, Gregg Rothermel, Kyle Patrick, Snehit Cherian
Given that much of the automated UI testing is "black box" in nature, very little information (only the test case descriptions and testing results) can be utilized to prioritize these automated UI test cases.
no code implementations • 14 May 2019 • Joymallya Chakraborty, Tianpei Xia, Fahmid M. Fahid, Tim Menzies
To the best of our knowledge, this is the first application of hyperparameter optimization as a tool for software engineers to generate fairer software.
no code implementations • 5 Feb 2019 • Amritanshu Agrawal, Wei Fu, Di Chen, Xipeng Shen, Tim Menzies
Machine learning techniques applied to software engineering tasks can be improved by hyperparameter optimization, i.e., automatic tools that find good settings for a learner's control parameters.
1 code implementation • 28 Apr 2018 • Tianpei Xia, Rahul Krishna, Jianfeng Chen, George Mathew, Xipeng Shen, Tim Menzies
We test OIL on a wide range of hyperparameter optimizers using data from 945 software projects.
Software Engineering
3 code implementations • 11 Mar 2018 • Vivek Nair, Rahul Krishna, Tim Menzies, Pooyan Jamshidi
Using this insight, this paper proposes BEETLE, a novel bellwether based transfer learning scheme, which can identify a suitable source and use it to find near-optimal configurations of a software system.
Software Engineering
no code implementations • 14 Feb 2018 • Suvodeep Majumder, Nikhila Balaji, Katie Brey, Wei Fu, Tim Menzies
Deep learners utilize extensive computational power and can take a long time to train, making it difficult to widely validate, repeat, and improve their results.
1 code implementation • 7 Jan 2018 • Vivek Nair, Zhe Yu, Tim Menzies, Norbert Siegmund, Sven Apel
FLASH scales up to software systems that defeat the prior state-of-the-art model-based methods in this area.
Software Engineering
no code implementations • 27 Aug 2017 • Jianfeng Chen, Tim Menzies
Cloud computing provides engineers and scientists with a place to run complex computing tasks.
1 code implementation • 17 Aug 2017 • Rahul Krishna, Tim Menzies
The current generation of software analytics tools consists mostly of prediction algorithms (e.g., support vector machines, naive Bayes, logistic regression, etc.).
Software Engineering
1 code implementation • 1 Mar 2017 • Wei Fu, Tim Menzies
(1) There is much variability in the efficacy of the Yang et al. predictors so even with their approach, some supervised data is required to prune weaker predictors away.
1 code implementation • 1 Mar 2017 • Wei Fu, Tim Menzies
While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost.
no code implementations • 27 Jan 2017 • Vivek Nair, Tim Menzies, Norbert Siegmund, Sven Apel
Despite the huge spread and economic importance of configurable software systems, there is unsatisfactory support for utilizing the full potential of these systems with respect to finding performance-optimal configurations.
no code implementations • 27 Jan 2017 • Jianfeng Chen, Vivek Nair, Tim Menzies
Context: Evolutionary algorithms typically require a large number of evaluations (of solutions) to converge, which can be very slow and expensive. Objective: To solve search-based software engineering (SE) problems using fewer evaluations than evolutionary methods. Method: Instead of mutating a small population, we build a very large initial population which is then culled using a recursive bi-clustering chop approach.
1 code implementation • 10 Dec 2016 • Zhe Yu, Nicholas A. Kraft, Tim Menzies
Literature reviews can be time-consuming and tedious to complete.
no code implementations • 8 Sep 2016 • Wei Fu, Vivek Nair, Tim Menzies
In software analytics, at least for defect prediction, several methods, such as grid search and differential evolution (DE), have been proposed to learn these parameters, and they have been shown to improve the performance scores of learners.
no code implementations • 2 Sep 2016 • Morakot Choetkiertikul, Hoa Khanh Dam, Truyen Tran, Trang Pham, Aditya Ghose, Tim Menzies
Although there has been substantial research in software analytics for effort estimation in traditional software projects, little work has been done for estimation in agile projects, especially estimating user stories or issues.
no code implementations • 29 Aug 2016 • Amritanshu Agrawal, Wei Fu, Tim Menzies
When run on different datasets, LDA suffers from "order effects", i.e., different topics are generated if the order of training data is shuffled.