The MT40K dataset for predicting malware threat intelligence is a collection of 40,000 triples generated from 27,354 unique entities and 34 relations. The corpus consists of approximately 1,100 de-identified plain text threat reports written between 2006-2021 and all CVE vulnerability descriptions created between 1990 to 2021. The annotated keyphrases were classified into entities derived from semantic categories defined in malware threat ontologies.

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages