Source Code Summarization
37 papers with code • 9 benchmarks • 7 datasets
Code Summarization is a task that tries to comprehend code and automatically generate descriptions directly from the source code.
Source: Improving Automatic Source Code Summarization via Deep Reinforcement Learning
Most implemented papers
Retrieval-Augmented Generation for Code Summarization via Hybrid GNN
Automatic code summarization is challenging because of the complexity of source code and the language gap between source code and natural-language summaries.
Contrastive Code Representation Learning
Recent work learns contextual representations of source code by reconstructing tokens from their context.
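As a rough illustration of "reconstructing tokens from their context", the sketch below builds one masked training example from a token sequence. The `MASK` string and the `mask_tokens` helper are hypothetical names chosen for this example, not part of any cited paper or library.

```python
MASK = "<mask>"  # hypothetical placeholder symbol, assumed for illustration


def mask_tokens(tokens, positions):
    """Replace the tokens at `positions` with MASK and return (masked, targets).

    `targets` maps each masked position to the original token, which is what
    a model would be trained to reconstruct from the surrounding context.
    """
    masked = list(tokens)
    targets = {}
    for i in positions:
        targets[i] = masked[i]
        masked[i] = MASK
    return masked, targets


tokens = ["return", "a", "+", "b"]
masked, targets = mask_tokens(tokens, [1])
print(masked)
print(targets)
```

In practice the masked positions are usually sampled at random; fixed positions are used here only to keep the example deterministic.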
GraphCodeBERT: Pre-training Code Representations with Data Flow
Instead of using a syntactic-level structure of code such as the abstract syntax tree (AST), we use data flow in the pre-training stage, a semantic-level structure of code that encodes the "where-the-value-comes-from" relation between variables.
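To make the "where-the-value-comes-from" relation concrete, here is a minimal sketch that extracts such edges from simple Python assignments using the standard `ast` module. It is a simplification for illustration only, not the extraction procedure used by GraphCodeBERT; the `value_flow_edges` helper is a hypothetical name, and it handles plain `name = expression` assignments only.

```python
import ast


def value_flow_edges(source):
    """Return (source_variable, target_variable) pairs for plain assignments.

    An edge (s, t) means the value of t is computed from s.
    Assumption: only `name = expr` statements are considered; every name
    appearing in the right-hand expression counts as a source.
    """
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            sources = [n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)]
            for t in targets:
                for s in sources:
                    edges.append((s, t))
    return edges


snippet = "x = a + b\ny = x * 2\n"
edges = value_flow_edges(snippet)
print(edges)
```

For the snippet above, the edges say that `x` comes from `a` and `b`, and `y` comes from `x`, which is information the token order alone does not state explicitly.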
Code Summarization with Structure-induced Transformer
Code summarization (CS) is becoming a promising area of language understanding research. It aims to automatically generate sensible natural-language descriptions of source code, for the convenience of programmers during development.
Neural Code Summarization
Code summarization is the task of generating readable summaries that are semantically meaningful and accurately describe the presumed task of a piece of software.
Unified Pre-training for Program Understanding and Generation
Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.
Improving Code Summarization with Block-wise Abstract Syntax Tree Splitting
In this paper, we present the Block-wise Abstract Syntax Tree Splitting method (BASTS for short), which fully utilizes the rich tree-form syntax structure in ASTs, for improving code summarization.
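The general idea of splitting code into blocks, each with its own smaller AST, can be sketched as follows. This is a deliberately simple stand-in that treats every top-level statement of a function body as one block. The actual splitting rule used by BASTS is not described in this excerpt, and `split_into_blocks` is a hypothetical helper.

```python
import ast


def split_into_blocks(source):
    """Split the first function in `source` into (node_type, code) blocks.

    Assumption: one block per top-level statement of the function body.
    Each block could then be parsed and encoded as its own small AST.
    """
    func = ast.parse(source).body[0]
    blocks = []
    for stmt in func.body:
        blocks.append((type(stmt).__name__, ast.get_source_segment(source, stmt)))
    return blocks


code = '''def total(xs):
    s = 0
    for x in xs:
        s += x
    return s
'''
blocks = split_into_blocks(code)
for kind, text in blocks:
    print(kind, repr(text))
```

Smaller per-block trees are shallower than the tree of the whole function, which is the practical motivation for splitting before encoding.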
Language-Agnostic Representation Learning of Source Code from Structure and Context
Source code (Context) and its parsed abstract syntax tree (AST; Structure) are two complementary representations of the same computer program.
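The two representations can be seen side by side with Python's standard library: `tokenize` gives the flat token sequence (Context) and `ast` gives the parsed tree (Structure). This is only a sketch of the two views for a Python program; the helper names are hypothetical and nothing here reflects how the cited model consumes them.

```python
import ast
import io
import tokenize

# Token types that carry layout rather than content.
_SKIP = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
}


def context_tokens(source):
    """Return the source code as a flat list of token strings (Context)."""
    toks = tokenize.generate_tokens(io.StringIO(source).readline)
    return [t.string for t in toks if t.type not in _SKIP]


def structure_nodes(source):
    """Return the AST node type names of the parsed program (Structure)."""
    return [type(n).__name__ for n in ast.walk(ast.parse(source))]


program = "def add(a, b):\n    return a + b\n"
tokens = context_tokens(program)
nodes = structure_nodes(program)
print(tokens)
print(nodes)
```

The token list preserves surface order and punctuation, while the node list exposes constructs such as `FunctionDef`, `Return`, and `BinOp` that never appear as literal tokens.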
Project-Level Encoding for Neural Source Code Summarization of Subroutines
Source code summarization of a subroutine is the task of writing a short, natural language description of that subroutine.
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
The transformer model, especially in combination with transfer learning, has proven to be a powerful technique for natural language processing tasks.