Code Classification

10 papers with code • 0 benchmarks • 7 datasets

Most implemented papers

CodeS: Towards Code Model Generalization Under Distribution Shift

testing-cs/codes-distribution-shift-benchmark-datasets 11 Jun 2022

Distribution shift has been a longstanding challenge for the reliable deployment of deep learning (DL) models due to unexpected accuracy degradation.

SCC: Automatic Classification of Code Snippets

mindscan-de/FluentGenesis-Classifier 21 Sep 2018

Determining the programming language of a source code file has been considered in the research community; it has been shown that Machine Learning (ML) and Natural Language Processing (NLP) algorithms can be effective in identifying the programming language of source code files.
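As a rough illustration of how an ML/NLP approach can identify the language of a snippet, the sketch below trains a multinomial Naive Bayes classifier on code tokens. It is not the SCC implementation: the tokenizer, the class name `NaiveBayesLanguageClassifier`, and the toy training samples are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical tokenizer: identifiers/keywords, plus single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z_]+|[^\sA-Za-z_0-9]")


def tokenize(code):
    """Split source code into word and punctuation tokens (digits are dropped)."""
    return TOKEN_RE.findall(code)


class NaiveBayesLanguageClassifier:
    """Multinomial Naive Bayes over code tokens with add-one smoothing."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)  # language -> token frequencies
        self.doc_counts = Counter()               # language -> number of samples
        self.vocab = set()

    def fit(self, samples):
        """samples: iterable of (code, language) pairs."""
        for code, lang in samples:
            tokens = tokenize(code)
            self.token_counts[lang].update(tokens)
            self.doc_counts[lang] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, code):
        """Return the language with the highest log posterior."""
        tokens = tokenize(code)
        total_docs = sum(self.doc_counts.values())
        best_lang, best_score = None, -math.inf
        for lang, counts in self.token_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[lang] / total_docs)
            for tok in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[tok] + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_lang, best_score = lang, score
        return best_lang


# Toy training data, invented for illustration only.
TRAIN = [
    ("def add(a, b):\n    return a + b", "python"),
    ("import os\nprint(os.getcwd())", "python"),
    ("public class Main { public static void main(String[] args) "
     "{ System.out.println(1); } }", "java"),
    ("private int add(int a, int b) { return a + b; }", "java"),
]

clf = NaiveBayesLanguageClassifier().fit(TRAIN)
print(clf.predict("def greet(name):\n    print(name)"))
print(clf.predict("public void run() { int x; }"))
```

A real classifier would need far more training data and richer features; the point here is only that token statistics alone already separate languages with distinctive keywords and punctuation.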

Embedding Java Classes with code2vec: Improvements from Variable Obfuscation

basedrhys/obfuscated-code2vec 6 Apr 2020

code2vec is a recently released embedding approach that uses the proxy task of method name prediction to map Java methods to feature vectors.

CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks

IBM/Project_CodeNet 25 May 2021

In addition to its large scale, CodeNet has a rich set of high-quality annotations to benchmark and help accelerate research in AI techniques for a variety of critical coding tasks, including code similarity and classification, code translation between a large variety of programming languages, and code performance (runtime and memory) improvement techniques.

Semantic Code Classification for Automated Machine Learning

whatevernevermindbro/nl2ml-mirror 25 Jan 2022

A range of applications for automatic machine learning need the generation process to be controllable.

Learning Program Semantics with Code Representations: An Empirical Study

jingkai92/learning-program-representation 22 Mar 2022

However, a comprehensive and systematic study evaluating different program representation techniques across diverse tasks is still missing.

MIXCODE: Enhancing Code Classification by Mixup-Based Data Augmentation

zemingd/mixup4code 6 Oct 2022

Data augmentation has been a popular approach to supplement training data in domains such as computer vision and NLP.
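For readers unfamiliar with mixup, the sketch below shows the generic formulation: two training examples and their one-hot labels are blended with a weight drawn from a Beta distribution. This listing does not describe how MIXCODE applies the idea to code, so the function names, the use of plain feature vectors, and the default `alpha` are assumptions for illustration only.

```python
import random


def sample_lambda(alpha=0.2, rng=None):
    """Draw the mixing weight from Beta(alpha, alpha); alpha=0.2 is an assumed default."""
    rng = rng or random
    return rng.betavariate(alpha, alpha)


def mixup(x1, y1, x2, y2, lam):
    """Linearly interpolate two feature vectors and their one-hot labels.

    x1, x2: feature vectors of equal length (e.g. hypothetical code embeddings).
    y1, y2: one-hot label vectors of equal length.
    lam:    mixing weight in [0, 1]; lam=1 returns the first example unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y


# Two toy examples from different classes, mixed with a fixed weight.
x_mixed, y_mixed = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
print(x_mixed, y_mixed)
```

The mixed label is a soft target, so training on such examples requires a loss that accepts probability vectors rather than class indices.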

Heterogeneous Directed Hypergraph Neural Network over abstract syntax tree (AST) for Code Classification

qiankunmu/hdhgn 7 May 2023

In this study, we propose to represent the AST as a heterogeneous directed hypergraph (HDHG) and process the graph with a heterogeneous directed hypergraph neural network (HDHGN) for code classification.
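To make the idea of a hypergraph over an AST more concrete, the sketch below uses Python's standard `ast` module to group each parent node with all of its children into one typed hyperedge. This is only one plausible reading and not the HDHG construction from the paper, which is not detailed in this listing; the function name `ast_hyperedges` and the (parent type, child types) edge format are assumptions.

```python
import ast


def ast_hyperedges(source):
    """Return one hyperedge per AST node that has children.

    Each hyperedge is a pair (parent_type, child_types): the parent acts as
    the head and its children, taken together, as the tail. Node type names
    stand in for the heterogeneous node labels.
    """
    tree = ast.parse(source)
    edges = []
    for node in ast.walk(tree):
        children = [type(child).__name__ for child in ast.iter_child_nodes(node)]
        if children:
            edges.append((type(node).__name__, tuple(children)))
    return edges


edges = ast_hyperedges("x = a + b")
for head, tail in edges:
    print(head, "<-", tail)
```

Unlike an ordinary parent-child edge list, each hyperedge here keeps all siblings together, which is the kind of higher-order relation a hypergraph can express.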

The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification

record/7608802 8 May 2023

These findings show that early layers can be used to obtain better results using the same resources, as well as to reduce resource usage during fine-tuning and inference.

Understanding Programs by Exploiting (Fuzzing) Test Cases

rabbitjy/fuzztuning 23 May 2023

The effectiveness of the proposed method is verified on two program understanding tasks, code clone detection and code classification, where it outperforms current state-of-the-art methods by large margins.