SOREL-20M (Sophos/ReversingLabs-20 Million)

Introduced by Harang et al. in SOREL-20M: A Large Scale Benchmark Dataset for Malicious PE Detection

SOREL-20M is a large-scale dataset of nearly 20 million files with pre-extracted features and metadata, high-quality labels derived from multiple sources, information about vendor detections of the malware samples at the time of collection, and “tags” for each malware sample that serve as additional targets.

Source: SOREL-20M
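
To illustrate how the parts listed above could fit together for one sample, here is a minimal Python sketch. It does not reflect the dataset's actual storage format or schema: the class name SampleRecord, all of its field names, the tag names, and the helper training_targets are hypothetical, chosen only to show a primary malware label used alongside per-tag auxiliary targets.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SampleRecord:
    """Hypothetical in-memory view of one sample (field names are assumptions)."""
    sha256: str                  # assumed file identifier
    features: List[float]        # pre-extracted feature vector
    is_malware: int              # primary label: 1 = malicious, 0 = benign
    vendor_detections: int       # assumed count of vendor detections at collection time
    tags: Dict[str, int] = field(default_factory=dict)  # assumed per-tag scores


def training_targets(record: SampleRecord, tag_names: List[str]) -> List[int]:
    """Return the primary label followed by one binary target per tag name."""
    tag_targets = [1 if record.tags.get(name, 0) > 0 else 0 for name in tag_names]
    return [record.is_malware] + tag_targets


# Made-up example values, for illustration only.
example = SampleRecord(
    sha256="0" * 64,
    features=[0.0, 1.5, 3.2],
    is_malware=1,
    vendor_detections=12,
    tags={"ransomware": 3, "dropper": 0},
)
targets = training_targets(example, ["ransomware", "dropper", "worm"])
```

A model trained this way would predict the malware label as its main output and the tag targets as auxiliary outputs; whether that setup suits a given experiment is a design choice outside this sketch.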

License


  • Unknown