DeepFix consists of a program repair dataset (fix compiler errors in C programs). It enables research around automatically fixing programming errors using deep learning.
37 PAPERS • 1 BENCHMARK
xCodeEval is one of the largest executable multilingual multitask benchmarks consisting of 17 programming languages with execution-level parallelism. It features a total of seven tasks involving code understanding, generation, translation, and retrieval, and it employs an execution-based evaluation instead of traditional lexical approaches. It also provides a test-case-based multilingual code execution engine, ExecEval that supports all the programming languages in xCodeEval.
6 PAPERS • NO BENCHMARKS YET
Repair AST parse (syntax) errors in Python code
4 PAPERS • 1 BENCHMARK
The dataset contains more than 100k code patch pairs extracted from open source projects on GitHub. Each pair comes with the erroneous and the fixed version of the corresponding code snippet. Instead of the whole file, the code snippets are extracted to focus on the problematic region (error line + other lines around it). For each sample, the repository name, the commit id, and the file names are provided so that one can access the complete files in case of interest.
1 PAPER • 1 BENCHMARK