Text-to-Code Generation

8 papers with code • 1 benchmark • 5 datasets

Text-to-Code Generation is the task of generating code from a natural language description.

Source: Text-to-code Generation with TensorFlow, 🤗 & MBPP
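As a rough illustration of the task definition above, the following sketch pairs a natural language description with a candidate program and checks it against a few assertions, in the spirit of MBPP-style test cases. The `TextToCodeExample` class, the `passes_tests` helper, and the sample problem are hypothetical names invented for this illustration; the candidate code is hand-written rather than produced by any model, and real evaluations run generated code inside a sandbox rather than calling `exec` directly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextToCodeExample:
    """Hypothetical container for one text-to-code instance."""
    description: str     # natural language input given to a model
    candidate_code: str  # code a model might return (hand-written here)
    tests: List[str]     # assertion strings used to check behavior


def passes_tests(example: TextToCodeExample) -> bool:
    """Run the candidate code, then every test assertion, in a fresh namespace.

    Assumption: the code is trusted. Untrusted model output should be
    executed in an isolated sandbox instead.
    """
    namespace: dict = {}
    try:
        exec(example.candidate_code, namespace)
        for test in example.tests:
            exec(test, namespace)
    except Exception:
        # Any syntax error, runtime error, or failed assertion counts as a failure.
        return False
    return True


example = TextToCodeExample(
    description=(
        "Write a function that returns the sum of the squares "
        "of a list of integers."
    ),
    candidate_code=(
        "def sum_of_squares(nums):\n"
        "    return sum(n * n for n in nums)\n"
    ),
    tests=[
        "assert sum_of_squares([1, 2, 3]) == 14",
        "assert sum_of_squares([]) == 0",
    ],
)

print(passes_tests(example))  # prints True
```

Checking behavior with assertions rather than comparing text is one common design choice here, since many different programs can satisfy the same description.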

Most implemented papers

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

salesforce/codet5 • EMNLP 2021

We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

microsoft/CodeXGLUE • 9 Feb 2021

Benchmark datasets have a significant impact on accelerating research in programming language tasks.

StructCoder: Structure-Aware Transformer for Code Generation

reddy-lab-code-research/structcoder • 10 Jun 2022

This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description.

PanGu-Coder: Program Synthesis with Function-Level Language Modeling

MindSpore-paper-code-2/code399 • 22 Jul 2022

We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e., the synthesis of programming language solutions given a natural language problem description.

C3PO: A Lightweight Copying Mechanism for Translating Pseudocode to Code

Pseudocode-to-Code/C3PO • AACL-IJCNLP 2022

In the Copy Phase, a binary classifier is employed to determine and mask the pseudocode tokens that can be directly copied into the code.

Code Execution with Pre-trained Language Models

microsoft/CodeBERT • 8 May 2023

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.

Guiding Language Models of Code with Global Context using Monitors

microsoft/monitors4codegen • 19 Jun 2023

We construct a repository-level dataset PragmaticCode for method-completion in Java and evaluate MGD on it.

Magicoder: Source Code Is All You Need

ise-uiuc/magicoder • 4 Dec 2023

Magicoder models are trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets to generate high-quality instruction data for code.