CodeSyntax is a large-scale dataset of programs annotated with the syntactic relationships in their corresponding abstract syntax trees. It contains 18,701 code samples annotated with 1,342,050 relation edges in 43 relation types for Python, and 13,711 code samples annotated with 864,411 relation edges in 39 relation types for Java. It is designed to evaluate the performance of language models on code syntax understanding.
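
To make "relation edges" concrete, here is a minimal sketch of how parent-child relations can be read off a Python abstract syntax tree using the standard `ast` module. The function name `relation_edges` and the `(parent type, field name, child type)` triple format are assumptions made for illustration; they are not the CodeSyntax annotation schema or its relation types, which are not described here.

```python
import ast


def relation_edges(source: str):
    """Collect (parent_type, field_name, child_type) triples from a Python AST.

    Hypothetical helper for illustration only; the triple format is an
    assumption and not the dataset's actual annotation format.
    """
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for field, value in ast.iter_fields(parent):
            # A field holds either a single node, a list of nodes, or a plain value.
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, ast.AST):
                    edges.append((type(parent).__name__, field, type(child).__name__))
    return edges


edges = relation_edges("if x > 0:\n    y = x\n")
for edge in edges:
    print(edge)
```

Each printed triple names one syntactic relationship, for example the link from an `If` node to its `test` condition or to a statement in its `body`.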

Source: https://paperswithcode.com/paper/benchmarking-language-models-for-code-syntax

