xCodeEval is one of the largest executable multilingual, multitask benchmarks, covering 17 programming languages with execution-level parallelism. It features seven tasks spanning code understanding, generation, translation, and retrieval, and it uses execution-based evaluation instead of traditional lexical approaches. It also provides ExecEval, a test-case-based multilingual code execution engine that supports all the programming languages in xCodeEval.
6 PAPERS • NO BENCHMARKS YET
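To illustrate the difference between execution-based and lexical evaluation mentioned in the xCodeEval description, here is a minimal Python sketch. It is not ExecEval's API or xCodeEval's actual harness; the helper `passes_all_tests`, the toy `add` problem, and the test cases are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# A test case is (positional arguments, expected result). Hypothetical format.
TestCase = Tuple[tuple, object]


def passes_all_tests(source: str, entry_point: str, tests: List[TestCase]) -> bool:
    """Execute `source`, then check the function `entry_point` on every test."""
    namespace: Dict[str, object] = {}
    try:
        # Unsandboxed exec: acceptable only for trusted toy input like this.
        exec(source, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors count as failures.
        return False


reference = "def add(a, b):\n    return a + b\n"
candidate = "def add(x, y):\n    total = x + y\n    return total\n"
wrong = "def add(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 5), 4)]

# Lexical comparison: the candidate's text differs from the reference.
exact_match = candidate == reference

# Execution-based comparison: the candidate behaves correctly on the tests.
candidate_ok = passes_all_tests(candidate, "add", tests)
wrong_ok = passes_all_tests(wrong, "add", tests)
```

In this sketch the candidate fails an exact-match comparison against the reference yet passes every test case, while the textually closer `wrong` solution fails on execution. A real engine would additionally need sandboxing, time and memory limits, and per-language toolchains.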
Defects4J is a collection of reproducible bugs and a supporting infrastructure with the goal of advancing software engineering research.
2 PAPERS • NO BENCHMARKS YET