A labeled benchmark dataset for training machine learning models to statically detect malicious Windows portable executable files. The dataset includes features extracted from 1.1M binary files: 900K training samples (300K malicious, 300K benign, 300K unlabeled) and 200K test samples (100K malicious, 100K benign).

Source: EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


License


Modalities


Languages