no code implementations • 29 Jan 2023 • Chad Mello, Troy Weingart, Ethan M. Rudd
We then partition this dataset into a transfer learning benchmark and demonstrate that our approach significantly reduces the per-subject data collection burden.
no code implementations • 5 Dec 2022 • Ethan M. Rudd, David Krisiloff, Scott Coull, Daniel Olszewski, Edward Raff, James Holt
In this paper, we explore the use of metric learning to embed Windows PE files in a low-dimensional vector space for downstream use in a variety of applications, including malware detection, family classification, and malware attribute tagging.
no code implementations • 5 Dec 2022 • Ethan M. Rudd, Mohammad Saidur Rahman, Philip Tully
We implement transformer models for two distinct InfoSec data formats, specifically URLs and PE files, in a novel end-to-end approach, and explore a variety of architectural designs, training regimes, and experimental settings to determine the ingredients necessary for performant detection models.
2 code implementations • 14 Dec 2020 • Richard Harang, Ethan M. Rudd
In this paper we describe the SOREL-20M (Sophos/ReversingLabs-20 Million) dataset: a large-scale dataset consisting of nearly 20 million files with pre-extracted features and metadata, high-quality labels derived from multiple sources, information about vendor detections of the malware samples at the time of collection, and additional "tags" related to each malware sample to serve as additional targets.
Cryptography and Security
1 code implementation • 5 Nov 2020 • Ethan M. Rudd, Ahmed Abdallah
Machine Learning (ML) for information security (InfoSec) utilizes distinct data types and formats which require different treatments during optimization/training on raw data.
1 code implementation • 16 May 2019 • Adarsh Kyadige, Ethan M. Rudd, Konstantin Berlin
In this paper, we propose utilizing a static source of contextual information -- the path of the PE file -- as an auxiliary input to the classifier.
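As a rough illustration of the idea of feeding a file path to a classifier alongside PE-derived features, the sketch below encodes a path as a fixed-length sequence of character IDs and pairs it with a feature vector. The function names, the character vocabulary, and the `max_len` value are assumptions made for this example; the excerpt above does not specify how the path is represented.

```python
def encode_path(path, max_len=100):
    """Map a file path to a fixed-length list of integer character IDs.

    Hypothetical encoding: lowercase the path, map each character to an ID
    in 1..255, truncate to max_len, and right-pad with 0 (the padding ID).
    """
    ids = [ord(c) % 255 + 1 for c in path.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def build_inputs(pe_features, path, max_len=100):
    """Bundle the PE feature vector with the encoded path as a second input."""
    return {
        "pe_features": list(pe_features),       # content-derived features
        "path_ids": encode_path(path, max_len)  # contextual (auxiliary) input
    }


example = build_inputs([0.1, 0.7, 0.0], "C:\\Temp\\a.exe", max_len=16)
```

A model consuming `example` would typically embed `path_ids` and merge the result with a representation of `pe_features` before the output layer; that merge step is left out here because the excerpt does not describe it.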
2 code implementations • 15 May 2019 • Felipe N. Ducau, Ethan M. Rudd, Tad M. Heppner, Alex Long, Konstantin Berlin
With the rapid proliferation and increased sophistication of malicious software (malware), detection methods no longer rely only on manually generated signatures but have also incorporated more general approaches like machine learning detection.
1 code implementation • 13 Mar 2019 • Ethan M. Rudd, Felipe N. Ducau, Cody Wild, Konstantin Berlin, Richard Harang
In this work, we fit deep neural networks to multiple additional targets derived from metadata in a threat intelligence feed for Portable Executable (PE) malware and benignware, including a multi-source malicious/benign loss, a count loss on multi-source detections, and a semantic malware attribute tag loss.
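The combination of a malicious/benign loss, a count loss, and a tag loss can be sketched as a weighted sum. This is a minimal sketch under stated assumptions: the Poisson form of the count loss, the per-tag binary cross-entropy, the weight values, and all names below are illustrative choices, not details taken from the excerpt.

```python
import math


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and label y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def poisson_nll(rate, count):
    """Negative log-likelihood of an observed count under a Poisson rate
    (assumed form for the count loss)."""
    rate = max(rate, 1e-12)
    return rate - count * math.log(rate) + math.lgamma(count + 1)


def multi_target_loss(pred, target, weights=None):
    """Weighted sum of the main loss and two auxiliary losses.

    The default weights are hypothetical placeholders.
    """
    weights = weights or {"malicious": 1.0, "count": 0.1, "tags": 0.1}
    main = bce(pred["malicious"], target["malicious"])
    count = poisson_nll(pred["count"], target["count"])
    tags = sum(bce(pred["tags"][t], y) for t, y in target["tags"].items())
    return (weights["malicious"] * main
            + weights["count"] * count
            + weights["tags"] * tags)


target = {"malicious": 1, "count": 12, "tags": {"ransomware": 1, "adware": 0}}
good = {"malicious": 0.95, "count": 11.0, "tags": {"ransomware": 0.9, "adware": 0.1}}
bad = {"malicious": 0.10, "count": 1.0, "tags": {"ransomware": 0.2, "adware": 0.8}}
```

With these toy values, `multi_target_loss(good, target)` is smaller than `multi_target_loss(bad, target)`, which is the behavior a training loop would rely on.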
no code implementations • 29 Oct 2018 • Richard Harang, Ethan M. Rudd
When the cost of misclassifying a sample is high, it is useful to have an accurate estimate of uncertainty in the prediction for that sample.
no code implementations • 4 Jan 2018 • Andras Rozsa, Manuel Günther, Ethan M. Rudd, Terrance E. Boult
Facial attributes, emerging soft biometrics, must be automatically and reliably extracted from images in order to be usable in stand-alone systems.
no code implementations • 3 May 2017 • Manuel Günther, Steve Cruz, Ethan M. Rudd, Terrance E. Boult
In this paper, we address the widespread misconception that thresholding verification-like scores is a good way to solve the open-set face identification problem, by formulating an open-set face identification protocol and evaluating different strategies for assessing similarity.
no code implementations • 21 Oct 2016 • Khudran Alzhrani, Ethan M. Rudd, Terrance E. Boult, C. Edward Chow
To analyze the ACESS system, we constructed a novel dataset, containing formerly classified paragraphs from diplomatic cables made public by the WikiLeaks organization.
no code implementations • 18 May 2016 • Andras Rozsa, Manuel Günther, Ethan M. Rudd, Terrance E. Boult
We show that FFA generates more adversarial examples than other related algorithms, and that DCNNs for certain attributes are generally robust to adversarial inputs, while DCNNs for other attributes are not.
no code implementations • 10 May 2016 • Ethan M. Rudd, Manuel Günther, Terrance E. Boult
Thus, there is great demand for an effective and low-cost system capable of rejecting such attacks. To this end, we introduce PARAPH -- a novel hardware extension that exploits different measurements of light polarization to yield an image space in which presentation media are readily discernible from bona fide facial characteristics.
no code implementations • 5 May 2016 • Andras Rozsa, Ethan M. Rudd, Terrance E. Boult
Finally, we demonstrate on LeNet and GoogLeNet that fine-tuning with a diverse set of hard positives improves the robustness of these networks compared to training with prior methods of generating adversarial images.
no code implementations • 19 Mar 2016 • Ethan M. Rudd, Andras Rozsa, Manuel Günther, Terrance E. Boult
While machine learning offers promising potential for increasingly autonomous solutions with improved generalization to new malware types, both at the network level and at the host level, our findings suggest that several flawed assumptions inherent to most recognition algorithms prevent a direct mapping between the stealth malware recognition problem and a machine learning solution.
no code implementations • 19 Jun 2015 • Ethan M. Rudd, Lalit P. Jain, Walter J. Scheirer, Terrance E. Boult
It is often desirable to be able to recognize when inputs to a recognition function learned in a supervised manner correspond to classes unseen at training time.