Non-negative matrix factorization (NMF) with missing-value completion is a well-known, effective Collaborative Filtering (CF) method used to provide personalized user recommendations.
We explore the utility of information contained within a dropout-based Bayesian neural network (BNN) for the task of detecting out-of-distribution (OOD) data.
Although groups of strongly correlated antivirus engines are known to exist, at present there is limited understanding of how or why these correlations came to be.
Malware family classification is a significant issue with public safety and research implications that has been hindered by the high cost of expert labels.
In some problem spaces, the high cost of obtaining ground truth labels necessitates the use of lower-quality reference datasets.
The detection of malware is a critical task for the protection of computing environments.
The unprecedented outbreak of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), the virus that causes COVID-19, continues to be a significant worldwide problem.
The use of Machine Learning has become a significant part of malware detection efforts due to the influx of new malware, an ever-changing threat landscape, and the ability of Machine Learning methods to discover meaningful distinctions between malicious and benign software.
Yara rules are a ubiquitous tool among cybersecurity practitioners and analysts.
Malware classification is a difficult problem, to which machine learning methods have been applied for decades.
Prior work inspired by compression algorithms has described how the Burrows-Wheeler Transform can be used to create a distance measure for bioinformatics problems.
N-grams have been a common tool for information retrieval and machine learning applications for decades.
As machine-learning (ML) based systems for malware detection become more prevalent, it becomes necessary to quantify their benefits relative to the more traditional anti-virus (AV) systems widely used today.
The Min-Hashing approach to sketching has become an important tool in data analysis, information retrieval, and classification.
In this work we explore the use of metric index structures, which accelerate nearest neighbor queries, in the scenario where we need to interleave insertions and queries during deployment.
In this work we introduce malware detection from raw byte sequences as a fruitful research area to the larger machine learning community.