Code Translation
16 papers with code • 2 benchmarks • 4 datasets
Libraries
Use these libraries to find Code Translation models and implementations
Most implemented papers
Unsupervised Translation of Programming Languages
We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy.
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Empirically, we show that composed fine-tuning improves over standard fine-tuning on two pseudocode-to-code translation datasets (3% and 6% relative).
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages
Recent advances in self-supervised learning have dramatically improved the state of the art on a wide variety of tasks.
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.
GraphCodeBERT: Pre-training Code Representations with Data Flow
Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.
Unified Pre-training for Program Understanding and Generation
Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.
CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks
In addition to its large scale, CodeNet has a rich set of high-quality annotations to benchmark and help accelerate research in AI techniques for a variety of critical coding tasks, including code similarity and classification, code translation between a large variety of programming languages, and code performance (runtime and memory) improvement techniques.
Learning C to x86 Translation: An Experiment in Neural Compilation
Deep learning has had a significant impact on many fields.
Leveraging Automated Unit Tests for Unsupervised Code Translation
With little to no parallel data available for programming languages, unsupervised methods are well-suited to source code translation.
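The GraphCodeBERT entry above describes data flow as the relation of "where-the-value-comes-from" between variables. As a rough illustration of that idea only, the sketch below extracts such edges from simple Python assignments using the standard `ast` module. It is not the authors' implementation: the function name `value_comes_from_edges` and the example snippet are hypothetical, and the analysis is deliberately simplified (name-level, no control flow, and any name read on the right-hand side, including a called function's name, counts as a source).

```python
import ast


def value_comes_from_edges(source: str) -> list[tuple[str, str]]:
    """Return (target, source_variable) pairs for simple assignments.

    Hypothetical helper for illustration; it ignores control flow,
    attribute and subscript targets, and augmented assignments.
    """
    edges = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            # Names that are read on the right-hand side of the assignment.
            read = sorted(
                {n.id for n in ast.walk(node.value) if isinstance(n, ast.Name)}
            )
            for target in node.targets:
                if isinstance(target, ast.Name):
                    # The target's value "comes from" each name that was read.
                    edges.extend((target.id, name) for name in read)
    return edges


snippet = "a = 1\nb = a + 2\nc = a * b\n"
edges = value_comes_from_edges(snippet)
for target, origin in edges:
    print(f"{target} <- {origin}")
```

Each printed pair reads as "the value of the left variable comes from the right variable". A real data-flow extractor would also need to track scopes and redefinitions, which this sketch leaves out.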