Contrastive Code Representation Learning

Recent work learns contextual representations of source code by reconstructing tokens from their context. For downstream semantic understanding tasks like summarizing code in English, these representations should ideally capture program functionality...
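
The title names contrastive representation learning, which is typically instantiated as an InfoNCE-style objective: an anchor embedding is pulled toward a "positive" embedding and pushed away from "negatives". For code, a natural positive pair is a semantics-preserving variant of the same program, matching the abstract's emphasis on capturing functionality rather than form. Below is a minimal sketch assuming PyTorch; the function name, tensor shapes, and temperature value are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE-style contrastive loss (illustrative sketch).

    query:         (d,)   embedding of an anchor program
    positive_key:  (d,)   embedding of a semantics-preserving variant
    negative_keys: (n, d) embeddings of unrelated programs
    """
    # Normalize so dot products are cosine similarities.
    query = F.normalize(query, dim=0)
    positive_key = F.normalize(positive_key, dim=0)
    negative_keys = F.normalize(negative_keys, dim=1)

    # Similarity of the anchor with the positive and each negative.
    pos_logit = (query @ positive_key).unsqueeze(0)            # (1,)
    neg_logits = negative_keys @ query                         # (n,)
    logits = torch.cat([pos_logit, neg_logits]) / temperature  # (n+1,)

    # Cross-entropy with the positive at index 0: maximize the
    # positive's similarity relative to all negatives.
    target = torch.zeros(1, dtype=torch.long)
    return F.cross_entropy(logits.unsqueeze(0), target)
```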


Results


TASK                       DATASET        MODEL       METRIC      VALUE   GLOBAL RANK
Method name prediction     CodeSearchNet  ContraCode  F1          17.24   #1
Source Code Summarization  CodeSearchNet  ContraCode  F1          17.24   #1
Type prediction            DeepTyper      ContraCode  Accuracy@5  84.60   #1

Methods used in the Paper


METHOD                           TYPE
Adam                             Stochastic Optimization
Linear Warmup With Linear Decay  Learning Rate Schedules
Dropout                          Regularization
Multi-Head Attention             Attention Modules
GELU                             Activation Functions
Layer Normalization              Normalization
Residual Connection              Skip Connections
Scaled Dot-Product Attention     Attention Mechanisms
Attention Dropout                Regularization
Weight Decay                     Regularization
Softmax                          Output Functions
Dense Connections                Feedforward Networks
WordPiece                        Subword Segmentation
BERT                             Language Models
RoBERTa                          Transformers
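
The methods above are the standard building blocks of a BERT/RoBERTa-style Transformer encoder. For orientation, here is a minimal sketch of how several of them compose into one encoder block: multi-head (scaled dot-product) attention, residual connections, layer normalization (post-norm, as in BERT), dropout, and a GELU feed-forward sublayer of dense connections. This assumes PyTorch; the dimensions and dropout rate are illustrative, not the paper's configuration.

```python
import torch.nn as nn

class EncoderBlock(nn.Module):
    """One post-norm Transformer encoder block (illustrative sketch)."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048, p_drop=0.1):
        super().__init__()
        # Multi-head scaled dot-product attention (softmax inside).
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=p_drop, batch_first=True)
        # Position-wise feed-forward network with GELU activation.
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Dropout(p_drop),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.drop = nn.Dropout(p_drop)

    def forward(self, x):
        # Residual connection around self-attention, then layer norm.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + self.drop(attn_out))
        # Residual connection around the feed-forward sublayer, then layer norm.
        x = self.norm2(x + self.drop(self.ff(x)))
        return x
```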