no code implementations • 5 Jan 2025 • Amol Khanna, Chenyi Ling, Derek Everett, Edward Raff, Nathan Inkawhich
Radial basis function networks (RBFNs) inherently link classification confidence and OOD detection; however, these networks have lost popularity due to the difficulty of training them in a multi-layer fashion.
no code implementations • 20 Dec 2024 • Seyedreza Mohseni, Seyedali Mohammadi, Deepa Tilwani, Yash Saxena, Gerald Ndawula, Sriram Vema, Edward Raff, Manas Gaur
In this study, we ask the following question: Can Large Language Models (LLMs) potentially generate new obfuscated assembly code?
no code implementations • 20 Dec 2024 • Nilanjana Das, Edward Raff, Manas Gaur
Previous research on LLM vulnerabilities often relied on nonsensical adversarial prompts, which were easily detectable by automated methods.
no code implementations • 5 Dec 2024 • Edward Raff, Michel Benaroch, Sagar Samtani, Andrew L. Farris
The concern that Artificial Intelligence (AI) and Machine Learning (ML) are entering a "reproducibility crisis" has spurred significant research in the past few years.
no code implementations • 27 Nov 2024 • Siddhant Gupta, Fred Lu, Andrew Barlow, Edward Raff, Francis Ferraro, Cynthia Matuszek, Charles Nicholas, James Holt
A strategy used by malicious actors is to "live off the land," where benign systems and tools already available on a victim's systems are used and repurposed for the malicious actor's intent.
1 code implementation • 31 Oct 2024 • Skyler Wu, Fred Lu, Edward Raff, James Holt
While such algorithms enjoy low theoretical regret, in real-world deployment they can be sensitive to individual outliers that cause the algorithm to over-correct.
1 code implementation • 30 Oct 2024 • Mohammad Mahmudul Alam, Alexander Oberle, Edward Raff, Stella Biderman, Tim Oates, James Holt
Vector Symbolic Architectures (VSAs) are one approach to developing Neuro-symbolic AI, where two vectors in $\mathbb{R}^d$ are `bound' together to produce a new vector in the same space.
1 code implementation • 30 Oct 2024 • Rebecca Saul, Chang Liu, Noah Fleischmann, Richard Zak, Kristopher Micinski, Edward Raff, James Holt
Binary analysis is a core component of many critical security tasks, including reverse engineering, malware analysis, and vulnerability detection.
no code implementations • 21 Oct 2024 • Amol Khanna, Adam McCormick, Andre Nguyen, Chris Aguirre, Edward Raff
In this article, we seek to elucidate challenges and opportunities for differential privacy within the federal government setting, as seen by a team of differential privacy researchers, privacy lawyers, and data scientists working closely with the U.S. government.
no code implementations • 20 Oct 2024 • John Hurwitz, Charles Nicholas, Edward Raff
It is generally well understood that predictive classification and compression are intrinsically related concepts in information theory.
no code implementations • 20 Aug 2024 • Ryan Swope, Amol Khanna, Philip Doldo, Saptarshi Roy, Edward Raff
Additionally, high-dimensional regression can leak information about individual datapoints in a dataset.
no code implementations • 20 Aug 2024 • Ashley Klein, Edward Raff, Elisabeth Seamon, Lily Foley, Timothy Bussert
However, patients with prelabor rupture of membranes (PROM) have only two commonly used options for cervical ripening, Pitocin and misoprostol.
no code implementations • 19 Jul 2024 • Nilanjana Das, Edward Raff, Manas Gaur
Previous research on testing the vulnerabilities in Large Language Models (LLMs) using adversarial attacks has primarily focused on nonsensical prompt injections, which are easily detected upon manual or automated review (e.g., via byte entropy).
1 code implementation • 8 Jul 2024 • Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt
As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention.
1 code implementation • 17 Jun 2024 • Seyedali Mohammadi, Edward Raff, Jinendra Malekar, Vedant Palit, Francis Ferraro, Manas Gaur
Language Models (LMs) are being proposed for mental health applications where the heightened risk of adverse outcomes means predictive performance may not be a sufficient litmus test of a model's utility in clinical practice.
no code implementations • 3 Jun 2024 • Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt
While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as data dimensionality increases.
2 code implementations • 7 May 2024 • Chang Liu, Rebecca Saul, Yihao Sun, Edward Raff, Maya Fuchs, Townsend Southard Pantano, James Holt, Kristopher Micinski
Our results illustrate the practical need for robust corpora of high-quality Windows PE binaries in training modern learning-based binary analyses.
no code implementations • 3 May 2024 • Deepa Tilwani, Yash Saxena, Ali Mohammadi, Edward Raff, Amit Sheth, Srinivasan Parthasarathy, Manas Gaur
Automatic citation generation for sentences in a document or report is paramount for intelligence analysts, cybersecurity, news agencies, and education personnel.
no code implementations • 1 Apr 2024 • Amol Khanna, Edward Raff, Nathan Inkawhich
Linear models are ubiquitous in data science, but are particularly prone to overfitting and data memorization in high dimensions.
no code implementations • 23 Mar 2024 • Mohammad Mahmudul Alam, Edward Raff, Stella Biderman, Tim Oates, James Holt
Malware detection is an interesting and valuable domain to work in because it has significant real-world impact and unique machine-learning challenges.
no code implementations • 18 Jan 2024 • Anish Lakkapragada, Amol Khanna, Edward Raff, Nathan Inkawhich
As machine learning becomes increasingly prevalent in impactful decisions, recognizing when inference data is outside the model's expected input distribution is paramount for giving context to predictions.
Dimensionality Reduction Out of Distribution (OOD) Detection
no code implementations • 25 Dec 2023 • Tirth Patel, Fred Lu, Edward Raff, Charles Nicholas, Cynthia Matuszek, James Holt
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines, meaning a 0.1% change can cause an overwhelming number of false positives.
1 code implementation • 23 Dec 2023 • Mohammad Mahmudul Alam, Edward Raff, Tim Oates
While deep learning has enjoyed significant success in computer vision tasks over the past decade, many shortcomings still exist from a Cognitive Science (CogSci) perspective.
1 code implementation • 2 Dec 2023 • Mohammad Mahmudul Alam, Edward Raff, Tim Oates, Cynthia Matuszek
In the case of DDx, the proposed network has achieved a mean accuracy of 99.82% and a mean F1 score of 0.9472.
no code implementations • 23 Aug 2023 • Catherine Ordun, Alexandra Cha, Edward Raff, Sanjay Purushotham, Karen Kwok, Mason Rule, James Gulley
Since thermal imagery offers a unique modality to investigate pain, the U.S. National Institutes of Health (NIH) has collected a large and diverse set of cancer patient facial thermograms for AI-based pain research.
no code implementations • 25 Jul 2023 • Skyler Wu, Fred Lu, Edward Raff, James Holt
Convolutional layers have long served as the primary workhorse for image classification.
no code implementations • 28 Jun 2023 • Corey J. Nolet, Divye Gala, Alex Fender, Mahesh Doijade, Joe Eaton, Edward Raff, John Zedlewski, Brad Rees, Tim Oates
In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only $O(Nk)$ space and uses a parameter $k$ to trade off space and time.
no code implementations • 27 Jun 2023 • Tyler LeBlond, Joseph Munoz, Fred Lu, Maya Fuchs, Elliott Zaresky-Williams, Edward Raff, Brian Testa
Differential privacy (DP) is the prevailing technique for protecting user data in machine learning models.
no code implementations • 16 Jun 2023 • Edward Raff, Michel Benaroch, Andrew L. Farris
In this survey we review the current literature on attacks and their real-world occurrences, or limited evidence thereof, to critically evaluate the real-world risks of adversarial machine learning (AML) for the average entity.
no code implementations • 10 Jun 2023 • Catherine Ordun, Edward Raff, Sanjay Purushotham
For a variety of biometric cross-spectral tasks, Visible-Thermal (VT) facial pairs are used.
no code implementations • 9 Jun 2023 • Robert J. Joyce, Tirth Patel, Charles Nicholas, Edward Raff
Our work explores the potential of antivirus (AV) scan data as a scalable source of features for malware.
1 code implementation • NeurIPS 2023 • Nora Belrose, David Schneider-Joseph, Shauli Ravfogel, Ryan Cotterell, Edward Raff, Stella Biderman
Concept erasure aims to remove specified features from a representation.
1 code implementation • 31 May 2023 • Mohammad Mahmudul Alam, Edward Raff, Stella Biderman, Tim Oates, James Holt
In recent years, self-attention has become the dominant paradigm for sequence modeling in a variety of domains.
no code implementations • 24 Apr 2023 • Amol Khanna, Fred Lu, Edward Raff, Brian Testa
LASSO regularized logistic regression is particularly useful for its built-in feature selection, allowing coefficients to be removed from deployment and producing sparse solutions.
2 code implementations • NeurIPS 2023 • Stella Biderman, USVSN Sai Prashanth, Lintang Sutawika, Hailey Schoelkopf, Quentin Anthony, Shivanshu Purohit, Edward Raff
Memorization, or the tendency of large language models (LLMs) to output entire sequences from their training data verbatim, is a key concern for safely deploying language models.
4 code implementations • 3 Apr 2023 • Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal
How do large language models (LLMs) develop and evolve over the course of training?
Ranked #4 on Language Modelling on LAMBADA (Perplexity metric)
no code implementations • 18 Mar 2023 • Amol Khanna, Fred Lu, Edward Raff
Linear $L_1$-regularized models have remained one of the simplest and most effective tools in data analysis, especially in information retrieval problems where n-grams over text with TF-IDF or Okapi feature values are a strong and easy baseline.
no code implementations • 18 Feb 2023 • Catherine Ordun, Edward Raff, Sanjay Purushotham
Thermal facial imagery offers valuable insight into physiological states such as inflammation and stress by detecting emitted radiation in the infrared spectrum, which is unseen in the visible spectrum.
no code implementations • 17 Feb 2023 • Luke E. Richards, Edward Raff, Cynthia Matuszek
Over the past decade, the machine learning security community has developed a myriad of defenses for evasion attacks.
no code implementations • 15 Jan 2023 • Fred Lu, Edward Raff, James Holt
Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets.
1 code implementation • 19 Dec 2022 • Zheng-Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Edward Raff, Dragomir Radev, Vassilina Nikoulina
We find language adaptation to be effective at improving zero-shot performance in new languages.
no code implementations • 5 Dec 2022 • Ethan M. Rudd, David Krisiloff, Scott Coull, Daniel Olszewski, Edward Raff, James Holt
In this paper, we explore the use of metric learning to embed Windows PE files in a low-dimensional vector space for downstream use in a variety of applications, including malware detection, family classification, and malware attribute tagging.
no code implementations • 23 Nov 2022 • Rebecca Saul, Mohammad Mahmudul Alam, John Hurwitz, Edward Raff, Tim Oates, James Holt
Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences.
1 code implementation • 3 Nov 2022 • Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel
We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus.
Ranked #1 on Question Answering on StoryCloze
no code implementations • 16 Oct 2022 • Fred Lu, Joseph Munoz, Maya Fuchs, Tyler LeBlond, Elliott Zaresky-Williams, Edward Raff, Francis Ferraro, Brian Testa
We present a framework to statistically audit the privacy guarantee conferred by a differentially private machine learner in practice.
no code implementations • 5 Sep 2022 • Derek Everett, Andre T. Nguyen, Luke E. Richards, Edward Raff
The quantification of uncertainty is important for the adoption of machine learning, especially to reject out-of-distribution (OOD) data back to human experts for review.
1 code implementation • 13 Jun 2022 • Mohammad Mahmudul Alam, Edward Raff, Tim Oates, James Holt
Due to the computational cost of running inference for a neural network, the need to deploy the inferential steps on a third party's compute environment or hardware is common.
no code implementations • 9 Jun 2022 • Fred Lu, Edward Raff, Francis Ferraro
Many metric learning tasks, such as triplet learning, nearest neighbor retrieval, and visualization, are treated primarily as embedding tasks where the ultimate metric is some variant of the Euclidean distance (e.g., cosine or Mahalanobis), and the algorithm must learn to embed points into the pre-chosen space.
no code implementations • 7 Jun 2022 • Michael D. Wong, Edward Raff, James Holt, Ravi Netravali
Data augmentation has been rare in the cyber security domain due to technical difficulties in altering data in a manner that is semantically consistent with the original data.
1 code implementation • 18 Apr 2022 • Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff
Generating and editing images from open domain text prompts is a challenging task that heretofore has required expensive and specially trained models.
no code implementations • 9 Apr 2022 • Edward Raff, Andrew L. Farris
Our argument is that this focus on code for replication is misguided if we want to improve the state of reproducible research.
1 code implementation • 8 Apr 2022 • Edward Raff
Yet to the best of our knowledge, only one work has attempted to look at this combined space, concluding that non-reproducible work is more highly cited.
no code implementations • 7 Apr 2022 • Catherine Ordun, Alexandra N. Cha, Edward Raff, Byron Gaskin, Alex Hanson, Mason Rule, Sanjay Purushotham, James L. Gulley
Cancer patients experience high rates of chronic pain throughout the treatment process.
no code implementations • 28 Feb 2022 • James Holt, Edward Raff, Ahmad Ridley, Dennis Ross, Arunesh Sinha, Diane Staheli, William Streilen, Milind Tambe, Yevgeniy Vorobeychik, Allan Wollaber
These challenges are widely studied in enterprise networks, but there are many gaps in research and practice as well as novel problems in other domains.
no code implementations • 18 Feb 2022 • Andre T. Nguyen, Fred Lu, Gary Lopez Munoz, Edward Raff, Charles Nicholas, James Holt
We explore the utility of information contained within a dropout based Bayesian neural network (BNN) for the task of detecting out of distribution (OOD) data.
no code implementations • 14 Feb 2022 • Fred Lu, Francis Ferraro, Edward Raff
Our method, which we term continuously generalized ordinal logistic, significantly outperforms the standard ordinal logistic model over a thorough set of ordinal regression benchmark datasets.
no code implementations • 19 Jan 2022 • Stella Biderman, Edward Raff
In this paper we explore whether transformers can be used to solve introductory level programming assignments while bypassing commonly used AI tools to detect similarities between pieces of software.
no code implementations • 28 Dec 2021 • Robert J. Joyce, Edward Raff, Charles Nicholas
Although groups of strongly correlated antivirus engines are known to exist, at present there is limited understanding of how or why these correlations came to be.
no code implementations • 27 Dec 2021 • Gaoussou Youssouf Kebe, Luke E. Richards, Edward Raff, Francis Ferraro, Cynthia Matuszek
Learning to understand grounded language, which connects natural language to percepts, is a critical research area.
1 code implementation • 29 Nov 2021 • Robert J. Joyce, Dev Amlani, Charles Nicholas, Edward Raff
Malware family classification is a significant issue with public safety and research implications that has been hindered by the high cost of expert labels.
no code implementations • 23 Sep 2021 • Robert J. Joyce, Edward Raff, Charles Nicholas
In some problem spaces, the high cost of obtaining ground truth labels necessitates use of lower quality reference datasets.
no code implementations • 23 Sep 2021 • Luke E. Richards, André Nguyen, Ryan Capps, Steven Forsythe, Cynthia Matuszek, Edward Raff
In this work we note that as studied, current transfer attack research has an unrealistic advantage for the attacker: the attacker has the exact same training data as the victim.
1 code implementation • NeurIPS 2021 • Ashwinkumar Ganesan, Hang Gao, Sunil Gandhi, Edward Raff, Tim Oates, James Holt, Mark McLean
HRRs today are not effective in a differentiable solution due to numerical instability, a problem we solve by introducing a projection step that forces the vectors to exist in a well behaved point in space.
no code implementations • 9 Aug 2021 • Andre T. Nguyen, Edward Raff, Charles Nicholas, James Holt
The detection of malware is a critical task for the protection of computing environments.
2 code implementations • 15 Jun 2021 • John Boutsikas, Maksim E. Eren, Charles Varga, Edward Raff, Cynthia Matuszek, Charles Nicholas
The use of Machine Learning has become a significant part of malware detection efforts due to the influx of new malware, an ever changing threat landscape, and the ability of Machine Learning methods to discover meaningful distinctions between malicious and benign software.
1 code implementation • 15 Jun 2021 • Catherine Ordun, Edward Raff, Sanjay Purushotham
These combined data are captured from similar sensors in order to bootstrap the training and transfer learning task, especially valuable because visible-thermal face datasets are limited.
no code implementations • 6 May 2021 • Edward Raff
K-Means++ and its distributed variant K-Means$\|$ have become de facto tools for selecting the initial seeds of K-means.
2 code implementations • 13 Apr 2021 • Corey J. Nolet, Divye Gala, Edward Raff, Joe Eaton, Brad Rees, John Zedlewski, Tim Oates
High-performance primitives for mathematical operations on sparse vectors must deal with the challenges of skewed degree distributions and limits on memory consumption that are typically not issues in dense operations.
no code implementations • 1 Mar 2021 • Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices.
1 code implementation • 17 Dec 2020 • Edward Raff
There has been increasing concern within the machine learning community that we are in a reproducibility crisis.
1 code implementation • 17 Dec 2020 • Edward Raff, William Fleshman, Richard Zak, Hyrum S. Anderson, Bobby Filar, Mark McLean
Recent works within machine learning have been tackling inputs of ever-increasing size, with cybersecurity presenting sequence classification problems of particularly extreme lengths.
no code implementations • 16 Nov 2020 • Nisha Pillai, Edward Raff, Francis Ferraro, Cynthia Matuszek
Ordering the selection of training data using active learning can lead to improvements in learning efficiently from smaller corpora.
no code implementations • 22 Oct 2020 • Edward Raff, Bobby Filar, James Holt
We propose a strategy for fixing false positives in production after a model has already been deployed.
no code implementations • 22 Sep 2020 • Catherine Ordun, Edward Raff, Sanjay Purushotham
But we also propose that thermal imagery may provide a semi-anonymous modality for computer vision, over RGB, which has been plagued by misuse in facial recognition.
1 code implementation • 6 Sep 2020 • Edward Raff, Richard Zak, Gary Lopez Munoz, William Fleming, Hyrum S. Anderson, Bobby Filar, Charles Nicholas, James Holt
Yara rules are a ubiquitous tool among cybersecurity practitioners and analysts.
no code implementations • 1 Sep 2020 • Andre T. Nguyen, Luke E. Richards, Gaoussou Youssouf Kebe, Edward Raff, Kasra Darvish, Frank Ferraro, Cynthia Matuszek
We propose a cross-modality manifold alignment procedure that leverages triplet loss to jointly learn consistent, multi-modal embeddings of language-based concepts of real-world items.
1 code implementation • 4 Aug 2020 • Maksim Ekin Eren, Nick Solovyev, Edward Raff, Charles Nicholas, Ben Johnson
The world has faced the devastating outbreak of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2), or COVID-19, in 2020.
1 code implementation • 1 Aug 2020 • Corey J. Nolet, Victor Lafargue, Edward Raff, Thejaswi Nanditale, Tim Oates, John Zedlewski, Joshua Patterson
The Uniform Manifold Approximation and Projection (UMAP) algorithm has become widely popular for its ease of use, quality of results, and support for exploratory, unsupervised, supervised, and semi-supervised learning.
no code implementations • 29 Jul 2020 • Patrick Jenkins, Rishabh Sachdeva, Gaoussou Youssouf Kebe, Padraig Higgins, Kasra Darvish, Edward Raff, Don Engel, John Winder, Francis Ferraro, Cynthia Matuszek
Grounded language acquisition -- learning how language-based interactions refer to the world around them -- is a major area of research in robotics, NLP, and HCI.
no code implementations • 15 Jun 2020 • Edward Raff, Charles Nicholas
Malware classification is a difficult problem, to which machine learning methods have been applied for decades.
2 code implementations • 6 May 2020 • Catherine Ordun, Sanjay Purushotham, Edward Raff
As the time to retweet increases, the density of connections also increases; in our sample, we found distinct users dominating the attention of Covid19 retweeters.
4 code implementations • 30 Dec 2019 • Edward Raff, Charles Nicholas, Mark McLean
Prior work inspired by compression algorithms has described how the Burrows Wheeler Transform can be used to create a distance measure for bioinformatics problems.
1 code implementation • CVPR 2020 • Arash Rahnama, Andre T. Nguyen, Edward Raff
We treat each individual layer of the DNN as a nonlinear dynamical system and use Lyapunov theory to prove stability and robustness locally.
no code implementations • 10 Oct 2019 • Andre T. Nguyen, Edward Raff, Aaron Sant-Miller
Successful malware attacks on information technology systems can cause millions of dollars in damage, the exposure of sensitive and private information, and the irreversible destruction of data.
1 code implementation • NeurIPS 2019 • Edward Raff
What makes a paper independently reproducible?
no code implementations • 24 Aug 2019 • Andre T. Nguyen, Edward Raff
Recent work has developed Bayesian methods for the automatic statistical analysis and description of single time series as well as of homogeneous sets of time series data.
1 code implementation • 1 Aug 2019 • Edward Raff, William Fleming, Richard Zak, Hyrum Anderson, Bill Finlayson, Charles Nicholas, Mark McLean
N-grams have been a common tool for information retrieval and machine learning applications for decades.
no code implementations • 17 Jul 2019 • Arash Rahnama, Andre T. Nguyen, Edward Raff
Significant work is being done to develop the math and tools necessary to build provable defenses, or at least bounds, against adversarial attacks of neural networks.
no code implementations • CVPR 2019 • Edward Raff, Jared Sylvester, Steven Forsyth, Mark McLean
Defenses against adversarial examples, when using the ImageNet dataset, are historically easy to defeat.
no code implementations • 7 Dec 2018 • Andre T. Nguyen, Edward Raff
Adversarial attacks against neural networks in a regression setting are a critical yet understudied problem.
no code implementations • 27 Sep 2018 • Edward Raff
Artificial Intelligence and Machine Learning have become transformative to a number of industries, and as such many industries' need for AI talent is increasing the demand for individuals with these skills.
no code implementations • 1 Jul 2018 • Edward Raff, Jared Sylvester
No methods currently exist for making arbitrary neural networks fair.
1 code implementation • 15 Jun 2018 • William Fleshman, Edward Raff, Jared Sylvester, Steven Forsyth, Mark McLean
Adversarial attacks against neural networks are a problem of considerable importance, for which effective defenses are not yet readily available.
no code implementations • 13 Jun 2018 • Jared Sylvester, Edward Raff
Machine learning practitioners are often ambivalent about the ethical aspects of their products.
no code implementations • 12 Jun 2018 • William Fleshman, Edward Raff, Richard Zak, Mark McLean, Charles Nicholas
As machine-learning (ML) based systems for malware detection become more prevalent, it becomes necessary to quantify the benefits compared to the more traditional anti-virus (AV) systems widely used today.
no code implementations • 30 Mar 2018 • Edward Raff, Jared Sylvester, Charles Nicholas
The Min-Hashing approach to sketching has become an important tool in data analysis, information retrieval, and classification.
no code implementations • 12 Jan 2018 • Edward Raff, Charles Nicholas
In this work we explore the use of metric index structures, which accelerate nearest neighbor queries, in the scenario where we need to interleave insertions and queries during deployment.
no code implementations • 21 Dec 2017 • Edward Raff, Jared Sylvester, Steven Mills
The potential lack of fairness in the outputs of machine learning algorithms has recently gained attention both within the research community and in society more broadly.
7 code implementations • 25 Oct 2017 • Edward Raff, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, Charles Nicholas
In this work we introduce malware detection from raw byte sequences as a fruitful research area to the larger machine learning community.
2 code implementations • 5 Sep 2017 • Edward Raff, Jared Sylvester, Charles Nicholas
Many efforts have been made to use various forms of domain knowledge in malware detection.
5 code implementations • Digital Investigation 2018 • Edward Raff, Charles K. Nicholas
Recent work has proposed the Lempel-Ziv Jaccard Distance (LZJD) as a method to measure the similarity between binary byte sequences for malware classification.
Cryptography and Security