Clone Detection
20 papers with code • 2 benchmarks • 1 dataset
Most implemented papers
UniASM: Binary Code Similarity Detection without Fine-tuning
Binary code similarity detection (BCSD) is widely used in various binary analysis tasks such as vulnerability search, malware detection, clone detection, and patch analysis.
Understanding Programs by Exploiting (Fuzzing) Test Cases
The effectiveness of the proposed method is verified on two program understanding tasks, code clone detection and code classification, where it outperforms current state-of-the-art methods by large margins.
Graph Neural Networks For Mapping Variables Between Programs -- Extended Version
Typically, in order to compare two programs, a relation between both programs' sets of variables is required.
ZC3: Zero-Shot Cross-Language Code Clone Detection
In this paper, we propose a novel method named ZC3 for Zero-shot Cross-language Code Clone detection.
Cloning and Beyond: A Quantum Solution to Duplicate Code
However, the kinds of problems for which these computers are a good fit, and the ways to express those problems, are substantially different from the kinds of problems and expressions used in classical computing.
TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation
Our framework has several advantages over existing methods: (1) It is flexible and adaptable, because it can easily be extended to other downstream tasks that require code representation (such as code-clone detection and classification); (2) it is efficient and scalable, because it does not require a large model or a large amount of training data, and it can support any programming language; (3) it is not limited to unsupervised learning, but can also be applied to some supervised learning tasks by incorporating task-specific labels or objectives; and (4) it can also adjust the number of encoder parameters based on computing resources.
Source Code Clone Detection Using Unsupervised Similarity Measures
Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection and code search and recommendation.
Investigating the Efficacy of Large Language Models for Code Clone Detection
GPT-based models are among the popular models studied for tasks such as code comment generation or test generation.
Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code
Therefore, auditing code developed using LLMs is challenging, as it is difficult to reliably assert whether an LLM used during development has been trained on specific copyrighted code, given that we do not have access to the training datasets of these models.
Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures
The capability of accurately determining code similarity is crucial in many tasks related to software development.
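To make the idea of an unsupervised code similarity measure concrete, here is a minimal sketch in Python. It is not the method of any paper listed above: the tokenizer, the two measures chosen (token-set Jaccard and a `difflib` sequence ratio), the unweighted average, and all function names are assumptions made purely for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(code):
    """Split source text into identifiers, numbers, and single symbols.

    A deliberately crude, language-agnostic tokenizer (an assumption for
    this sketch; a real tool would use a proper lexer).
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of the two token sets (ignores order and counts)."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0  # two empty snippets are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def sequence_ratio(tokens_a, tokens_b):
    """Order-sensitive similarity of the token sequences via difflib."""
    return SequenceMatcher(None, tokens_a, tokens_b, autojunk=False).ratio()


def ensemble_similarity(code_a, code_b):
    """Unweighted mean of the two measures; the result lies in [0, 1]."""
    tokens_a, tokens_b = tokenize(code_a), tokenize(code_b)
    return (jaccard(tokens_a, tokens_b) + sequence_ratio(tokens_a, tokens_b)) / 2


if __name__ == "__main__":
    original = "def add(a, b): return a + b"
    renamed = "def add(x, y): return x + y"   # same logic, renamed variables
    unrelated = "print('hello')"
    print(ensemble_similarity(original, renamed))
    print(ensemble_similarity(original, unrelated))
```

In this sketch a pair would be flagged as a clone candidate when its score exceeds a threshold chosen on labeled data. Note that renamed identifiers lower both measures, which is one reason practical approaches often normalize identifiers or compare learned representations instead.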