
CodeBERT is a bimodal pre-trained model for programming language (PL) and natural language (NL). It learns general-purpose representations that support downstream NL-PL applications such as natural language code search and code documentation generation. CodeBERT uses a Transformer-based neural architecture and is trained with a hybrid objective function that incorporates the pre-training task of replaced token detection: classifying whether each token is the original or a plausible alternative sampled from small generator models. This design lets the model exploit both bimodal data (NL-PL pairs) and unimodal data, where the former provides input tokens for model training and the latter helps to learn better generators.

Source: CodeBERT: A Pre-Trained Model for Programming and Natural Languages
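As a sketch of how the released checkpoint is typically used, the snippet below loads CodeBERT through the Hugging Face `transformers` library and encodes an NL-PL pair. The checkpoint name `microsoft/codebert-base` is the authors' published model; the query and code snippet are illustrative placeholders.

```python
# A minimal sketch: encoding a natural language / code pair with CodeBERT
# via the Hugging Face `transformers` library. The query and code below
# are illustrative placeholders.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

nl = "return the maximum value in a list"   # natural language query
code = "def max_value(xs): return max(xs)"  # candidate code snippet

# CodeBERT consumes the pair as [CLS] NL [SEP] code [SEP] (realized with
# RoBERTa's <s>/</s> special tokens); passing two arguments to the
# tokenizer builds that paired format automatically.
inputs = tokenizer(nl, code, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The first-position vector is a common choice as a joint NL-PL
# representation, e.g. for ranking candidates in code search.
pair_embedding = outputs.last_hidden_state[:, 0, :]
print(pair_embedding.shape)  # torch.Size([1, 768])
```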
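The replaced token detection objective mentioned above can be illustrated in the abstract: a generator proposes plausible substitutes at some positions, and the discriminator labels every token as original or replaced. The sketch below shows only the shape of that per-token binary loss; `logits` and `is_replaced` are hypothetical tensors standing in for discriminator scores and replacement labels, not part of any released API.

```python
# An abstract sketch of a replaced-token-detection loss, assuming a
# generator has already substituted some input tokens. `logits` and
# `is_replaced` are hypothetical stand-in tensors for illustration.
import torch
import torch.nn.functional as F

batch, seq_len = 2, 8
logits = torch.randn(batch, seq_len)  # discriminator score per token
is_replaced = torch.randint(0, 2, (batch, seq_len)).float()  # 1 = replaced

# Binary classification over every token position: was this token the
# original, or a plausible alternative sampled from the generator?
loss = F.binary_cross_entropy_with_logits(logits, is_replaced)
print(loss.item())
```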

Tasks


Task                     Papers  Share
Code Search              10      10.31%
Vulnerability Detection   8       8.25%
Code Generation           7       7.22%
Language Modelling        7       7.22%
Retrieval                 4       4.12%
Language Modeling         3       3.09%
Large Language Model      3       3.09%
Code Translation          3       3.09%
Classification            3       3.09%

Categories


Transformers