Clone Detection
18 papers with code • 2 benchmarks • 1 dataset
Most implemented papers
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree
To the best of our knowledge, we are the first to apply graph neural networks to code clone detection.
Contrastive Code Representation Learning
Recent work learns contextual representations of source code by reconstructing tokens from their context.
GraphCodeBERT: Pre-training Code Representations with Data Flow
Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
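To make the "where-the-value-comes-from" relation concrete, here is a minimal sketch that extracts such edges from straight-line Python assignments using the standard `ast` module. It is an illustration only, not GraphCodeBERT's extraction pipeline: the function name `data_flow_edges` is hypothetical, and the sketch works at the level of variable names and ignores control flow, reassignment, and scoping.

```python
import ast


def data_flow_edges(source: str) -> list[tuple[str, str]]:
    """Return (target, origin) pairs meaning: the value of `target` comes from `origin`.

    Assumption: only simple `name = expression` assignments are handled.
    """
    edges = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        # Variables read on the right-hand side of the assignment.
        reads = sorted(
            {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
        )
        for target in node.targets:
            if isinstance(target, ast.Name):
                edges.extend((target.id, name) for name in reads)
    return edges


snippet = "x = a + b\ny = x * 2\n"
edges = data_flow_edges(snippet)
print(edges)  # [('x', 'a'), ('x', 'b'), ('y', 'x')]
```

Unlike an AST, these edges do not change when the expression is restructured syntactically (for example, `b + a` instead of `a + b`), which is the sense in which data flow is described as a semantic-level structure.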
Unified Pre-training for Program Understanding and Generation
Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding
In this paper, we propose an approach to bridge pre-trained models and code-related tasks.
Learning Program Semantics with Code Representations: An Empirical Study
However, a comprehensive and systematic study evaluating different program representation techniques across diverse tasks is still missing.
On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules
Adapters are plug-and-play and parameter-efficient, which makes adapting to many downstream tasks easier than fine-tuning, where all of the model's parameters must be retrained; however, their use in software engineering has not been explored.
A Neural Network Architecture for Program Understanding Inspired by Human Behaviors
In this paper, we consider human behaviors and propose the PGNN-EK model that consists of two main components.