Synthesizing Program Input Grammars

5 Aug 2016 · Osbert Bastani, Rahul Sharma, Alex Aiken, Percy Liang

We present an algorithm for synthesizing a context-free grammar encoding the language of valid program inputs from a set of input examples and blackbox access to the program. Our algorithm addresses shortcomings of existing grammar inference algorithms, which both severely overgeneralize and are prohibitively slow. Our implementation, GLADE, leverages the grammar synthesized by our algorithm to fuzz test programs with structured inputs. We show that GLADE substantially increases the incremental coverage on valid inputs compared to two baseline fuzzers.
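
To make the setting concrete, the following toy sketch shows how blackbox access to a program can be used to test a candidate generalization of a single input example. It is not GLADE's algorithm and does not build a context-free grammar. It only checks which substrings of a seed input can be duplicated without the program rejecting the result, then samples new inputs from those repetitions. The `oracle` function, the bracketed digit-list input format, and the helper names `repeatable_spans` and `sample_inputs` are all assumptions introduced for illustration.

```python
import random
import re


def oracle(s: str) -> bool:
    """Hypothetical blackbox program: accepts bracketed lists of single digits.

    In the real setting this would run the program under test and report
    whether it accepted the input; the regex is a stand-in.
    """
    return re.fullmatch(r"\[(\d(,\d)*)?\]", s) is not None


def repeatable_spans(seed: str, is_valid) -> list[tuple[int, int]]:
    """Return spans (i, j) of `seed` that can be duplicated and stay valid.

    Each candidate is checked with one oracle query. Passing this check
    suggests, but does not prove, that the span can be repeated freely.
    """
    spans = []
    for i in range(len(seed)):
        for j in range(i + 1, len(seed) + 1):
            candidate = seed[:i] + seed[i:j] * 2 + seed[j:]
            if is_valid(candidate):
                spans.append((i, j))
    return spans


def sample_inputs(seed: str, spans, count: int, rng: random.Random) -> list[str]:
    """Sample new inputs by repeating one accepted span 0 to 3 times."""
    samples = []
    for _ in range(count):
        i, j = rng.choice(spans)
        k = rng.randint(0, 3)
        samples.append(seed[:i] + seed[i:j] * k + seed[j:])
    return samples


seed = "[1,2]"
spans = repeatable_spans(seed, oracle)
fuzz_inputs = sample_inputs(seed, spans, 20, random.Random(0))
```

For the seed `[1,2]`, the accepted spans correspond to the substrings `1,` and `,2`, so the sampled inputs are lists such as `[2]` or `[1,2,2,2]`. A single duplication check like this can overgeneralize on other input formats, which is the kind of failure the abstract attributes to existing grammar inference algorithms.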
