CodeXGLUE

Introduced by Lu et al. in CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

CodeXGLUE is a benchmark dataset and open challenge for code intelligence. It combines a collection of code intelligence tasks with a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark for CODE. It comprises 14 datasets for 10 diversified code intelligence tasks, covering the following scenarios:

code-code (clone detection, defect detection, cloze test, code completion, code repair, and code-to-code translation)
text-code (natural language code search, text-to-code generation)
code-text (code summarization)
text-text (documentation translation)
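
The scenario grouping above can be written down as a small lookup table, which also makes it easy to check that the listed tasks add up to the 10 stated in the description. This is only an illustrative sketch: the names `SCENARIOS` and `scenario_of` are hypothetical and are not part of CodeXGLUE or its tooling.

```python
# Scenario -> tasks, transcribed from the list above.
# SCENARIOS and scenario_of are illustrative names, not CodeXGLUE APIs.
SCENARIOS = {
    "code-code": [
        "clone detection",
        "defect detection",
        "cloze test",
        "code completion",
        "code repair",
        "code-to-code translation",
    ],
    "text-code": ["natural language code search", "text-to-code generation"],
    "code-text": ["code summarization"],
    "text-text": ["documentation translation"],
}


def scenario_of(task: str) -> str:
    """Return the scenario a task belongs to; raise KeyError if unknown."""
    for scenario, tasks in SCENARIOS.items():
        if task in tasks:
            return scenario
    raise KeyError(task)


# Count tasks across all scenarios: 6 + 2 + 1 + 1.
total_tasks = sum(len(tasks) for tasks in SCENARIOS.values())
print(total_tasks)  # 10
print(scenario_of("code summarization"))  # code-text
```

Note that the table counts tasks, not datasets: several tasks have more than one dataset, which is how 10 tasks yield 14 datasets.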

A brief summary of CodeXGLUE is provided in the figure, including tasks, datasets, languages, sizes in various states, baseline systems, providers, and short definitions of each task. Datasets highlighted in blue are newly introduced.

Benchmarks

Task	Dataset Variant	Best Model
Code Summarization	CodeXGLUE - CodeSearchNet	CodeT5
Code Completion	CodeXGLUE - PY150	CodeGPT-adapted
Code Search	CodeXGLUE - AdvTest	CodeT5+ 770M
Code Completion	CodeXGLUE - Github Java Corpus	CodeGPT-adapted
Code Generation	CodeXGLUE - CodeSearchNet	Redcoder-ext
Defect Detection	CodeXGLUE - Devign	CodeT5
Text-to-Code Generation	CodeXGLUE - CONCODE	CodeT5
Code Translation	CodeXGLUE - CodeTrans	CodeT5
Clone Detection	CodeXGLUE - BigCloneBench	CodeT5
Clone Detection	CodeXGLUE - POJ-104	CodeBERT
Code Repair	CodeXGLUE - Bugs2Fix	CodeBERT
Document Translation	CodeXGLUE - Microsoft Docs	Pretrained Transformer
Cloze Test	CodeXGLUE - CT-maxmin	CodeBERT
Cloze Test	CodeXGLUE - CT-all	CodeBERT
Code Search	CodeXGLUE - WebQueryTest	CodeBERT