In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data.
We present a new program synthesis approach that combines an encoder-decoder based synthesis architecture with a differentiable program fixer.
By studying a popular, non-trivial program repair task, variable-misuse identification, we explore the relative merits of traditional and hybrid model families for code representation.
We fine-tune CuBERT on our benchmark tasks, and compare the resulting models to different variants of Word2Vec token embeddings, BiLSTM and Transformer models, as well as published state-of-the-art models, showing that CuBERT outperforms them all, even with shorter training, and with fewer labeled examples.
We show that it is beneficial to train a model that jointly and directly localizes and repairs variable-misuse bugs.
Our two SPORE algorithms are able to build relationships between responses (e. g., the execution time of a computer program) and features, and select a few from hundreds of the retrieved features to construct an explicitly sparse and non-linear model to predict the response variable.