Text-to-Code Generation
10 papers with code • 1 benchmark • 8 datasets
Text-to-Code Generation is the task of generating source code from a natural language description.
Source: Text-to-code Generation with TensorFlow, 🤗 & MBPP
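A minimal sketch of the task with 🤗 Transformers: prompt a pretrained code language model with a natural language description and let it complete the implementation. The checkpoint name (Salesforce/codegen-350M-mono) is one reasonable publicly available choice, not one prescribed by this page.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint: a small decoder-only code model; any text-to-code
# capable model from the Hub could be substituted here.
checkpoint = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Natural language description given as a comment, followed by a function stub.
prompt = "# Return the n-th Fibonacci number\ndef fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```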
Most implemented papers
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed by developer-assigned identifiers.
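Because CodeT5 is an encoder-decoder model, it is used as a sequence-to-sequence generator. A rough sketch of loading the released base checkpoint is below; note that the base model is only pre-trained, so in practice it would be fine-tuned on a text-to-code dataset (e.g. CONCODE from CodeXGLUE) before its generations are useful.

```python
from transformers import RobertaTokenizer, T5ForConditionalGeneration

# CodeT5 ships with a RoBERTa-style tokenizer and a T5-style seq2seq model.
checkpoint = "Salesforce/codet5-base"
tokenizer = RobertaTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# After fine-tuning, the natural language description is the encoder input
# and the decoder emits the target code.
nl = "sort a list of integers in descending order"
inputs = tokenizer(nl, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```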
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
Benchmark datasets have a significant impact on accelerating research in programming language tasks.
Magicoder: Empowering Code Generation with OSS-Instruct
Magicoder models are trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets to generate diverse instruction data for code.
StructCoder: Structure-Aware Transformer for Code Generation
This paper addresses the problem of code generation, where the goal is to generate target code given source code in a different language or a natural language description.
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e., the synthesis of programming language solutions given a natural language problem description.
C3PO: A Lightweight Copying Mechanism for Translating Pseudocode to Code
In the Copy Phase, a binary classifier is employed to determine and mask the pseudocode tokens that can be directly copied into the code.
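A toy illustration of a copy-phase style step (not the C3PO implementation): a per-token binary classifier decides which pseudocode tokens can be copied verbatim into the code, and those tokens are replaced by a mask so the rest of the pipeline only has to generate the remaining tokens. The classifier here is a hypothetical stand-in for a learned model.

```python
from typing import Callable, List

MASK = "<copy>"

def mask_copyable(tokens: List[str], is_copyable: Callable[[str], bool]) -> List[str]:
    """Replace tokens the classifier marks as directly copyable with a mask."""
    return [MASK if is_copyable(tok) else tok for tok in tokens]

def toy_classifier(tok: str) -> bool:
    # Hypothetical stand-in for the learned binary classifier: here we simply
    # flag a few known identifiers/literals as copyable.
    return tok in {"count", "0", "i", "n"}

pseudocode = "set count to 0".split()
print(mask_copyable(pseudocode, toy_classifier))
# ['set', '<copy>', 'to', '<copy>']
```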
Code Execution with Pre-trained Language Models
Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.
Guiding Language Models of Code with Global Context using Monitors
We construct a repository-level dataset, PragmaticCode, for method completion in Java and evaluate MGD on it.
Can Large Language Models Solve Robot Routing?
We systematically investigate the performance of LLMs in robot routing by constructing a dataset with 80 unique robot routing problems across 8 variants in both single and multi-robot settings.
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain.