Source Code Summarization

37 papers with code • 9 benchmarks • 7 datasets

Code Summarization is a task that tries to comprehend code and automatically generate descriptions directly from the source code.

Source: Improving Automatic Source Code Summarization via Deep Reinforcement Learning

Benchmarks

These leaderboards track progress in Source Code Summarization.

Dataset	Best Model
DeepCom-Java	AdaMo-noise
ParallelCorpus-Python	AdaMo-basic
CodeSearchNet	ContraCode
Summarizing Source Code using a Neural Attention Model - C#	CodeTrans-MT-Large
Summarizing Source Code using a Neural Attention Model - Python	CodeTrans-MT-Base
Summarizing Source Code using a Neural Attention Model - SQL	CodeTrans-MT-TF-Large
CoDesc	Transformer
Java scripts	AdaMo-basic
CodeSearchNet - Python	AdaMo-basic

Libraries

Use these libraries to find Source Code Summarization models and implementations.

transms/m2ts

2 papers

Datasets

Subtasks

Method name prediction

Most implemented papers

CoDesc: A Large Code-Description Parallel Dataset

csebuetnlp/CoDesc • 29 May 2021

In this study, we present CoDesc, a large parallel dataset composed of 4.2 million Java methods and natural language descriptions.

Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors

SageSELab/CodeSumStudy • ACL (NLP4Prog) 2021

Automated source code summarization is a popular software engineering research topic wherein machine translation models are employed to "translate" code snippets into relevant natural language descriptions.

On the Evaluation of Neural Code Summarization

DeepSoftwareAnalytics/CodeSumEvaluation • 15 Jul 2021

To achieve a profound understanding of how far we are from solving this problem and provide suggestions to future research, in this paper, we conduct a systematic and in-depth analysis of 5 state-of-the-art neural code summarization models on 6 widely used BLEU variants, 4 pre-processing operations and their combinations, and 3 widely used datasets.
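
To illustrate why the choice of BLEU variant matters when scoring generated summaries, here is a minimal sketch of sentence-level BLEU with and without smoothing. The function names `ngram_counts` and `sentence_bleu` are hypothetical, and the add-one smoothing shown is an illustrative assumption, not necessarily one of the six variants the paper analyzes.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4, smooth=False):
    """Sentence-level BLEU against a single reference (illustrative sketch).

    smooth=False: geometric mean of clipped n-gram precisions; the score
                  is 0 as soon as one n-gram order has no match.
    smooth=True:  add one to the match count and the total for n > 1
                  (an assumed smoothing scheme chosen for illustration).
    """
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp = ngram_counts(hypothesis, n)
        ref = ngram_counts(reference, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(count, ref[gram]) for gram, count in hyp.items())
        total = sum(hyp.values())
        if smooth and n > 1:
            matches, total = matches + 1, total + 1
        if matches == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "returns the sum of two numbers".split()
hypothesis = "sum of two values".split()
plain = sentence_bleu(reference, hypothesis)
smoothed = sentence_bleu(reference, hypothesis, smooth=True)
print(f"unsmoothed: {plain:.4f}, smoothed: {smoothed:.4f}")
```

In this toy pair the hypothesis shares no 4-gram with the reference, so the unsmoothed score collapses to zero while the smoothed score stays positive. Short code summaries often hit this case, which is one reason scores computed with different variants are not directly comparable.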

GraphSearchNet: Enhancing GNNs via Capturing Global Dependencies for Semantic Code Search

shangqing-liu/graphsearchnet • 4 Nov 2021

Specifically, we propose to construct graphs for the source code and queries with bidirectional GGNN (BiGGNN) to capture the local structural information of the source code and queries.

Modeling Hierarchical Syntax Structure with Triplet Position for Source Code Summarization

gjcexp/codescribe • ACL ARR November 2021

In this paper, we propose CODESCRIBE to model the hierarchical syntax structure of code by introducing a novel triplet position for code summarization.

Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow

scam2021-so/scam2021 • 27 Nov 2021

Automated source code summarization is a task that generates summarized information about the purpose, usage, and/or implementation of methods and classes to support understanding of these code entities.
