SPARK: Static Program Analysis Reasoning and Retrieving Knowledge

3 Nov 2017  ·  Wasuwee Sodsong, Bernhard Scholz, Sanjay Chawla ·

Program analysis is a technique to reason about programs without executing them, and it has various applications in compilers, integrated development environments, and security. In this work, we present a machine learning pipeline that induces a security analyzer for programs by example. The security analyzer determines whether a program is either secure or insecure based on symbolic rules that were deduced by our machine learning pipeline. The machine pipeline is two-staged consisting of a Recurrent Neural Networks (RNN) and an Extractor that converts an RNN to symbolic rules. To evaluate the quality of the learned symbolic rules, we propose a sampling-based similarity measurement between two infinite regular languages. We conduct a case study using real-world data. In this work, we discuss the limitations of existing techniques and possible improvements in the future. The results show that with sufficient training data and a fair distribution of program paths it is feasible to deducing symbolic security rules for the OpenJDK library with millions lines of code.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here