Convolutional Neural Network for Classification of Malware Assembly Code
Traditional signature-based methods are becoming inadequate against next-generation malware, which uses sophisticated obfuscation techniques (polymorphic and metamorphic) to evade detection. Recently, research efforts have applied machine learning techniques to malware detection and classification. Nevertheless, most of these methods are built on shallow learning architectures and rely on the extraction of hand-crafted features. In this paper, we present a convolutional neural network architecture that operates on assembly language code extracted from disassembled binary files and embedded into vectors, and that learns a set of discriminative patterns able to group malware files into families. To demonstrate the suitability of our approach, we evaluated our model on the data provided by Microsoft for the BigData Innovators Gathering 2015 Anti-Malware Prediction Challenge. Experiments show that the method achieves competitive results without relying on the manual extraction of features and is resilient to the most common obfuscation techniques.
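As an illustration of the general idea only, the following minimal sketch runs a forward pass over a sequence of assembly mnemonics: each mnemonic is mapped to an embedding vector, a bank of convolutional filters slides over the sequence, the strongest response of each filter is kept (max-over-time pooling), and a softmax layer produces family probabilities. Everything specific here is an assumption made for the example and is not taken from the paper: the toy vocabulary `VOCAB`, the sizes `EMBED_DIM`, `NUM_FILTERS` and `KERNEL`, the use of nine output classes, the random untrained weights, and the helper names `init_params` and `forward`.

```python
import math
import random

# Hypothetical toy vocabulary of x86 mnemonics; a real one would be built from the corpus.
VOCAB = {"<unk>": 0, "mov": 1, "push": 2, "pop": 3, "call": 4,
         "xor": 5, "jmp": 6, "add": 7, "ret": 8}

# Illustrative sizes only (assumptions, not the paper's hyperparameters).
EMBED_DIM, NUM_FILTERS, KERNEL, NUM_CLASSES = 4, 3, 3, 9


def rand_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def init_params(seed=0):
    """Create random (untrained) parameters for the sketch."""
    rng = random.Random(seed)
    return {
        "embed": rand_matrix(len(VOCAB), EMBED_DIM, rng),
        # Each filter spans KERNEL consecutive embedding vectors.
        "filters": rand_matrix(NUM_FILTERS, KERNEL * EMBED_DIM, rng),
        "dense": rand_matrix(NUM_CLASSES, NUM_FILTERS, rng),
    }


def forward(opcodes, params):
    """Map a list of mnemonics to a probability distribution over families."""
    # Unknown mnemonics fall back to the <unk> index.
    ids = [VOCAB.get(op, 0) for op in opcodes]
    # Pad short sequences so that at least one convolution window exists.
    ids += [0] * max(0, KERNEL - len(ids))
    vectors = [params["embed"][i] for i in ids]

    pooled = []
    for filt in params["filters"]:
        # Starting at 0.0 and taking the max is equivalent to applying
        # ReLU to every window response and then max-pooling over time.
        best = 0.0
        for start in range(len(vectors) - KERNEL + 1):
            window = [x for vec in vectors[start:start + KERNEL] for x in vec]
            best = max(best, sum(w * x for w, x in zip(filt, window)))
        pooled.append(best)

    # Fully connected layer followed by a numerically stable softmax.
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in params["dense"]]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    probs = forward(["push", "mov", "call", "ret"], init_params())
    print([round(p, 3) for p in probs])
```

Because the weights are random, the printed probabilities carry no meaning; in practice the embedding, filter and dense weights would be learned jointly from labeled samples. Max-over-time pooling is used in the sketch because it yields a fixed-size representation regardless of how many instructions a file contains.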