Search Results for author: Richard Harang

Found 7 papers, 4 papers with code

SOREL-20M: A Large Scale Benchmark Dataset for Malicious PE Detection

2 code implementations • 14 Dec 2020 • Richard Harang, Ethan M. Rudd

In this paper we describe the SOREL-20M (Sophos/ReversingLabs-20 Million) dataset: a large-scale dataset consisting of nearly 20 million files with pre-extracted features and metadata, high-quality labels derived from multiple sources, information about vendor detections of the malware samples at the time of collection, and additional "tags" related to each malware sample to serve as additional targets.

Cryptography and Security

ALOHA: Auxiliary Loss Optimization for Hypothesis Augmentation

1 code implementation • 13 Mar 2019 • Ethan M. Rudd, Felipe N. Ducau, Cody Wild, Konstantin Berlin, Richard Harang

In this work, we fit deep neural networks to multiple additional targets derived from metadata in a threat intelligence feed for Portable Executable (PE) malware and benignware, including a multi-source malicious/benign loss, a count loss on multi-source detections, and a semantic malware attribute tag loss.

Attribute, Malware Detection, +1

Towards Principled Uncertainty Estimation for Deep Neural Networks

no code implementations • 29 Oct 2018 • Richard Harang, Ethan M. Rudd

When the cost of misclassifying a sample is high, it is useful to have an accurate estimate of uncertainty in the prediction for that sample.

Bayesian Inference

Git Blame Who?: Stylistic Authorship Attribution of Small, Incomplete Source Code Fragments

no code implementations • 20 Jan 2017 • Edwin Dauber, Aylin Caliskan, Richard Harang, Gregory Shearer, Michael Weisman, Frederica Nelson, Rachel Greenstadt

We show that we can also use these calibration curves in the case that we do not have linking information and thus are forced to classify individual samples directly.

Authorship Attribution

Crafting Adversarial Input Sequences for Recurrent Neural Networks

1 code implementation • 28 Apr 2016 • Nicolas Papernot, Patrick McDaniel, Ananthram Swami, Richard Harang

Machine learning models are frequently used to solve complex security problems, as well as to make decisions in sensitive situations like guiding autonomous vehicles or predicting financial market behaviors.

Autonomous Vehicles, BIG-bench Machine Learning, +1

When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries

3 code implementations • 28 Dec 2015 • Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, Arvind Narayanan

Many distinguishing features present in source code, e.g., variable names, are removed in the compilation process, and compiler optimization may alter the structure of a program, further obscuring features that are known to be useful in determining authorship.

Cryptography and Security
