Clone Detection

20 papers with code • 2 benchmarks • 1 dataset

Most implemented papers

UniASM: Binary Code Similarity Detection without Fine-tuning

clm07/uniasm 28 Oct 2022

Binary code similarity detection (BCSD) is widely used in various binary analysis tasks such as vulnerability search, malware detection, clone detection, and patch analysis.

Understanding Programs by Exploiting (Fuzzing) Test Cases

rabbitjy/fuzztuning 23 May 2023

The effectiveness of the proposed method is verified on two program understanding tasks, code clone detection and code classification, where it outperforms the current state of the art by large margins.

Graph Neural Networks For Mapping Variables Between Programs -- Extended Version

pmorvalho/ecai23-gnns-for-mapping-variables-between-programs 24 Jul 2023

Typically, in order to compare two programs, a relation between both programs' sets of variables is required.

ZC3: Zero-Shot Cross-Language Code Clone Detection

lairikeqia/zc3 26 Aug 2023

In this paper, we propose a novel method named ZC3 for Zero-shot Cross-language Code Clone detection.

Cloning and Beyond: A Quantum Solution to Duplicate Code

SamyakJhaveri/Quantum-Graph-Subgraph-Isomorphism-Public • Onward!: ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, pages 32–49, 2023

However, the kinds of problems for which quantum computers are a good fit, and the ways to express those problems, are substantially different from the kinds of problems and expressions used in classical computing.

TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation

iamfaith/transformcode 10 Nov 2023

Our framework has several advantages over existing methods: (1) It is flexible and adaptable, because it can easily be extended to other downstream tasks that require code representation (such as code-clone detection and classification); (2) it is efficient and scalable, because it does not require a large model or a large amount of training data, and it can support any programming language; (3) it is not limited to unsupervised learning, but can also be applied to some supervised learning tasks by incorporating task-specific labels or objectives; and (4) it can also adjust the number of encoder parameters based on computing resources.

Source Code Clone Detection Using Unsupervised Similarity Measures

jorge-martinez-gil/codesim 18 Jan 2024

Assessing similarity in source code has gained significant attention in recent years due to its importance in software engineering tasks such as clone detection and code search and recommendation.
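As a rough illustration of what an unsupervised similarity measure for source code can look like, the sketch below computes Jaccard similarity over token sets. It is not taken from the paper or its repository; the tokenizer regex and the function names `tokens` and `jaccard_similarity` are assumptions made for this example.

```python
import re

def tokens(code):
    """Return the set of identifier, number, and punctuation tokens in `code`.

    The regex is a simplified, language-agnostic tokenizer (an assumption).
    """
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))

def jaccard_similarity(a, b):
    """Jaccard similarity of the two snippets' token sets, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty snippets are treated as identical
    return len(ta & tb) / len(ta | tb)

# Hypothetical snippets: a and b differ only in variable names.
snippet_a = "def add(x, y):\n    return x + y"
snippet_b = "def add(a, b):\n    return a + b"
snippet_c = "print('hello')"

score_ab = jaccard_similarity(snippet_a, snippet_b)
score_ac = jaccard_similarity(snippet_a, snippet_c)
```

Here the renamed-variable pair scores higher than the unrelated pair, which is the behaviour a clone detector would threshold on. A set-based measure ignores token order, so it is only a coarse signal.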

Investigating the Efficacy of Large Language Models for Code Clone Detection

mkhfring/largelanguagemodels 24 Jan 2024

GPT-based models are among the most widely studied models for tasks such as code comment generation and test generation.

Trained Without My Consent: Detecting Code Inclusion In Language Models Trained on Code

commissarsilver/trawic 14 Feb 2024

Therefore, auditing code developed using LLMs is challenging: because the training datasets of these models are not accessible, it is difficult to reliably assert whether an LLM used during development has been trained on specific copyrighted code.

Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures

no code yet • 3 May 2024

The capability of accurately determining code similarity is crucial in many tasks related to software development.
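One simple way to picture an ensemble of unsupervised similarity measures is a weighted average of several independent scores. The sketch below combines a set-based Jaccard score with an order-aware ratio from Python's standard `difflib.SequenceMatcher`. The choice of measures, the equal weights, and all function names are assumptions for illustration, not the method described in the paper.

```python
import difflib
import re

def token_list(code):
    """Ordered tokens from a simplified, language-agnostic regex (an assumption)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)

def jaccard(a, b):
    """Order-insensitive overlap of the two token sets."""
    ta, tb = set(token_list(a)), set(token_list(b))
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def sequence_ratio(a, b):
    """Order-sensitive similarity of the two token sequences."""
    matcher = difflib.SequenceMatcher(None, token_list(a), token_list(b))
    return matcher.ratio()

def ensemble_similarity(a, b, weights=(0.5, 0.5)):
    """Weighted mean of the individual measures; the weights are illustrative."""
    scores = (jaccard(a, b), sequence_ratio(a, b))
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical snippets: a and b differ only in variable names.
snippet_a = "def add(x, y):\n    return x + y"
snippet_b = "def add(a, b):\n    return a + b"
snippet_c = "print('hello')"

clone_score = ensemble_similarity(snippet_a, snippet_b)
unrelated_score = ensemble_similarity(snippet_a, snippet_c)
```

Averaging lets a measure that is blind to one property, such as token order, be compensated by another that captures it.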