Code Generation

289 papers with code • 18 benchmarks • 38 datasets

Code Generation is the task of predicting explicit code or program structure from multimodal data sources such as incomplete code, programs in another programming language, natural-language descriptions, or execution examples. Code generation models can power automatic programming tools that improve developer productivity.

Source: Deep Learning for Source Code Modeling and Generation

Image source: Measuring Coding Challenge Competence With APPS



Most implemented papers

LLaMA: Open and Efficient Foundation Language Models

facebookresearch/llama arXiv 2023

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.

Evaluating Large Language Models Trained on Code

openai/human-eval 7 Jul 2021

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.
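The Codex paper evaluates models on HumanEval with the pass@k metric: generate n samples per problem, count the c that pass all unit tests, and estimate the probability that at least one of k samples is correct. A minimal sketch of the paper's unbiased estimator (the function name here is illustrative; openai/human-eval ships its own implementation):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n = samples generated and c = samples passing all tests."""
    if n - c < k:
        return 1.0  # every size-k subset contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 5, 1))   # 5 of 10 correct -> 0.5 for a single draw
print(pass_at_k(10, 0, 5))   # no correct samples -> 0.0
```

Computing the estimator from counts, rather than naively averaging 1 - (1 - c/n)^k, avoids bias when k is close to n.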

Llama 2: Open Foundation and Fine-Tuned Chat Models

facebookresearch/llama 18 Jul 2023

In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters.

pix2code: Generating Code from a Graphical User Interface Screenshot

tonybeltramelli/pix2code 22 May 2017

Transforming a graphical user interface screenshot created by a designer into computer code is a typical task conducted by a developer in order to build customized software, websites, and mobile applications.

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

pcyin/tranX ACL 2018

Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.

A Syntactic Neural Model for General-Purpose Code Generation

pcyin/NL2code ACL 2017

We consider the problem of parsing natural language descriptions into source code written in a general-purpose programming language like Python.
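Syntax-driven models like this one decode an abstract syntax tree under the target language's grammar rather than a flat token sequence. Python's standard ast module shows what such a target tree looks like (a sketch of the representation, not the paper's model):

```python
import ast

# Natural-language intent: "sort my_list in descending order".
# A grammar-constrained decoder emits the AST for the matching code.
tree = ast.parse("my_list.sort(reverse=True)")

call = tree.body[0].value            # the method-call node
print(type(call).__name__)           # Call
print(call.keywords[0].arg)          # reverse
```

Generating well-formed trees guarantees syntactically valid output, which token-level sequence models cannot.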

A parallel corpus of Python functions and documentation strings for automated code documentation and code generation

Avmb/code-docstring-corpus IJCNLP 2017

Automated documentation of programming source code and automated code generation from natural language are challenging tasks of both practical and scientific interest.
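A corpus like this aligns function bodies with their documentation strings. One common way to mine such pairs from Python source is static parsing; a minimal sketch (the corpus's own extraction pipeline may differ):

```python
import ast

source = '''
def add(a, b):
    """Return the sum of a and b."""
    return a + b
'''

# Collect (function name, docstring) pairs -- the kind of parallel
# data used for code-to-documentation and documentation-to-code models.
pairs = [(node.name, ast.get_docstring(node))
         for node in ast.walk(ast.parse(source))
         if isinstance(node, ast.FunctionDef)]
print(pairs)  # [('add', 'Return the sum of a and b.')]
```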

CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation

salesforce/codet5 EMNLP 2021

We present CodeT5, a unified pre-trained encoder-decoder Transformer model that better leverages the code semantics conveyed from the developer-assigned identifiers.

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.