Code Generation

337 papers with code • 17 benchmarks • 43 datasets

Code Generation is the task of predicting explicit code or program structure from multimodal data sources such as incomplete code, programs in another programming language, natural language descriptions, or execution examples. Code Generation tools can support the development of automatic programming tools that improve programming productivity.

Source: Deep Learning for Source Code Modeling and Generation

Image source: Measuring Coding Challenge Competence With APPS

Benchmarks

These leaderboards are used to track progress in Code Generation.

Dataset	Best Model
HumanEval	AgentCoder (GPT-4)
MBPP	GPT-4 + AgentCoder
CoNaLa	PanGu-Coder-FT-I
APPS	CodeRL+CodeT5
Django	MarianCG
WikiSQL	NL2SQL-RULE
PECC	Claude 3 Haiku
CoNaLa-Ext	BART Base
Turbulence	GPT-4
CodeContests	AlphaCode 41B + clustering
Shellcode_IA32	CodeBERT
TACO-Code	GPT-4
CodeXGLUE - CodeSearchNet	Redcoder-ext
CONCODE	Redcoder-ext
Verified Smart Contract Code Comments	GPT-J 6B Smart Contract
Android Repos	Entity Type Model
DSEval-Kaggle	CoML

Libraries

Use these libraries to find Code Generation models and implementations.

epfllm/megatron-llm

4 papers

eth-sri/sven

4 papers

uiuc-focal-lab/syncode

3 papers

DeepLearnXMU/CG-RL

3 papers

See all 15 libraries.

Most implemented papers

Programming Puzzles

microsoft/PythonProgrammingPuzzles • 10 Jun 2021

The dataset is comprehensive in that it spans problems of a range of difficulties and domains, from trivial string manipulation problems, to classic programming puzzles (e.g., Tower of Hanoi), to interview/competitive-programming problems (e.g., dynamic programming), to longstanding open problems in algorithms and mathematics (e.g., factoring).

Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions

xqx12/daily-info • 20 Aug 2021

The most notable of these comes in the form of the first self-described 'AI pair programmer', GitHub Copilot, a language model trained over open-source GitHub code.

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

anthropics/hh-rlhf • 12 Apr 2022

We apply preference modeling and reinforcement learning from human feedback (RLHF) to finetune language models to act as helpful and harmless assistants.

InCoder: A Generative Model for Code Infilling and Synthesis

dpfried/incoder • 12 Apr 2022

Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.
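As a rough illustration of what code infilling involves, the sketch below shows a generic fill-in-the-middle layout: the text before and after a gap is arranged into a prompt, and the model's output is spliced back into the gap. This is not InCoder's actual input format. The sentinel strings `<MASK>` and `<END>` and the helper names are hypothetical, and no model is called; the completion is hard-coded to mimic a type-inference case.

```python
# Hypothetical sentinel tokens, chosen only for this illustration.
MASK = "<MASK>"
END = "<END>"


def make_infill_prompt(prefix: str, suffix: str) -> str:
    """Arrange left and right context so a left-to-right model can fill the gap.

    The gap is marked once in place and once at the end, where the model
    would be expected to continue with the missing span.
    """
    return f"{prefix}{MASK}{suffix}{MASK}"


def splice(prefix: str, suffix: str, completion: str) -> str:
    """Insert the generated span between prefix and suffix.

    Anything after the END sentinel in the completion is discarded.
    """
    span = completion.split(END, 1)[0]
    return prefix + span + suffix


if __name__ == "__main__":
    prefix = "def add(a: int, b: int) -> "
    suffix = ":\n    return a + b\n"
    print(make_infill_prompt(prefix, suffix))
    # Stand-in for a model's output: the missing return type.
    print(splice(prefix, suffix, "int" + END))
```

Keeping prompt construction and splicing as separate functions makes it easy to swap in a real model between the two steps.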

AskIt: Unified Programming Interface for Programming with Large Language Models

katsumiok/pyaskit • 29 Aug 2023

Developers face decisions regarding the use of LLMs for directly performing tasks within applications as well as for generating and executing code to accomplish these tasks.

Mixtral of Experts

hit-scir/chinese-mixtral-8x7b • 8 Jan 2024

In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.

Latent Predictor Networks for Code Generation

deepmind/card2code • ACL 2016

Many language generation tasks require the production of text conditioned on both structured and unstructured inputs.

Bidirectional Attention for SQL Generation

guotong1988/NL2SQL • 30 Dec 2017

Generating structured query language (SQL) queries from natural language is a long-standing open problem.

Building Language Models for Text with Named Entities

uclanlp/NamedEntityLanguageModel • ACL 2018

Text in many domains involves a significant amount of named entities.

Structural Language Models of Code

tech-srl/slm-code-generation • ICML 2020

We introduce a new approach to any-code completion that leverages the strict syntax of programming languages to model a code snippet as a tree: structural language modeling (SLM).
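To make the idea of treating a code snippet as a tree concrete, the sketch below uses Python's standard `ast` module to parse a snippet and list its syntax-tree nodes depth-first. It only illustrates the tree view of code and is not the SLM model or its representation; the helper name `outline` is an assumption made for this example.

```python
import ast


def outline(source: str) -> list[str]:
    """Return one line per syntax-tree node, indented by depth (depth-first)."""
    lines: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        # Record the node's type name, indented two spaces per tree level.
        lines.append("  " * depth + type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)

    visit(ast.parse(source), 0)
    return lines


if __name__ == "__main__":
    # Prints the tree for a one-line assignment, root node first.
    print("\n".join(outline("x = y + 1")))
```

Because the grammar fixes which child nodes each node type may have, a model that predicts nodes of such a tree works with syntactically constrained structures rather than free-form token sequences.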

