The MT40K dataset for predicting malware threat intelligence is a collection of 40,000 triples generated from 27,354 unique entities and 34 relations. The corpus consists of approximately 1,100 de-identified plain-text threat reports written between 2006 and 2021 and all CVE vulnerability descriptions created between 1990 and 2021. The annotated keyphrases were classified into entities derived from semantic categories defined in malware threat ontologies.
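
To make the triple structure concrete, the following is a minimal sketch of how a knowledge-graph dataset of this kind could be loaded and summarized. It assumes a tab-separated `head<TAB>relation<TAB>tail` layout, which is a common convention for triple datasets but is not confirmed for MT40K. The `Triple` type, the `parse_triples` and `summarize` helpers, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head entity, relation, tail entity) fact."""
    head: str
    relation: str
    tail: str


def parse_triples(lines):
    """Parse tab-separated 'head<TAB>relation<TAB>tail' lines (assumed format)."""
    triples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        head, relation, tail = line.split("\t")
        triples.append(Triple(head, relation, tail))
    return triples


def summarize(triples):
    """Count triples, unique entities (heads and tails), and unique relations."""
    entities = {t.head for t in triples} | {t.tail for t in triples}
    relation_counts = Counter(t.relation for t in triples)
    return {
        "triples": len(triples),
        "entities": len(entities),
        "relations": len(relation_counts),
    }


# Made-up illustrative rows; these are NOT taken from MT40K.
sample = [
    "malware_a\tuses\tvulnerability_x",
    "malware_a\ttargets\tsoftware_y",
    "malware_b\tuses\tvulnerability_x",
]
stats = summarize(parse_triples(sample))
print(stats)
```

Run against the full dataset under the same format assumption, a summary like this would be expected to report the figures quoted above (40,000 triples, 27,354 entities, 34 relations).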