CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

Program synthesis strives to generate a computer program as a solution to a given problem specification, expressed with input-output examples or natural language descriptions. The prevalence of large language models advances the state-of-the-art for program synthesis, though limited training resources and data impede open access to such models. To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER. We show the utility of the trained model by demonstrating that it is competitive with the previous state-of-the-art on zero-shot Python code generation on HumanEval. We further investigate the multi-step paradigm for program synthesis, where a single program is factorized into multiple prompts specifying subproblems. To this end, we construct an open benchmark, the Multi-Turn Programming Benchmark (MTPB), consisting of 115 diverse problem sets that are factorized into multi-turn prompts. Our analysis on MTPB shows that the same intent provided to CODEGEN in multi-turn fashion significantly improves program synthesis over that provided as a single turn. We make the training library JAXFORMER and the model checkpoints available as open source contributions: https://github.com/salesforce/CodeGen.
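
The multi-turn paradigm described above amounts to issuing one sub-problem at a time and appending each completion to the context before the next turn. The following is a minimal sketch of that idea, assuming the released checkpoints are mirrored on the Hugging Face Hub under names such as Salesforce/codegen-350M-mono; that checkpoint name, the comment-per-turn prompt format, and the sampling settings are illustrative assumptions, not the paper's MTPB evaluation harness.

# Minimal sketch: single- and multi-turn sampling with a CodeGen checkpoint.
# The model id and generation settings are assumptions for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/codegen-350M-mono"  # assumed Hugging Face mirror of CODEGEN-MONO 350M
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def complete(prompt: str, max_new_tokens: int = 128) -> str:
    """Sample a completion for a (possibly multi-turn) prompt prefix."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,
        temperature=0.2,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Return only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:])

# Multi-turn paradigm: each sub-problem is given as a comment; the completion
# for turn i is appended to the context before turn i+1 is issued.
turns = [
    "# Define a list of the first ten positive integers.\n",
    "# Keep only the even numbers.\n",
    "# Print the sum of the remaining numbers.\n",
]
context = ""
for turn in turns:
    context += turn
    context += complete(context) + "\n"
print(context)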


Results from the Paper


All rows below report zero-shot code generation on the HumanEval benchmark.

Model               Metric    Metric Value  Global Rank
CodeGen-Mono 350M   Pass@1    12.8          #116
CodeGen-Mono 16B    Pass@1    29.3          #81
CodeGen-Mono 16B    Pass@10   49.9          #1
CodeGen-Mono 16B    Pass@100  75.0          #1
CodeGen-Mono 6.1B   Pass@1    26.1          #89
CodeGen-Mono 2.7B   Pass@1    23.7          #91
CodeGen-NL 16B      Pass@1    14.2          #113
CodeGen-Multi 16B   Pass@1    18.3          #101
CodeGen-Multi 6.1B  Pass@1    18.2          #103
CodeGen-Multi 2.7B  Pass@1    14.5          #112
CodeGen-Multi 350M  Pass@1    6.7           #124
CodeGen-NL 6.1B     Pass@1    10.4          #121
CodeGen-NL 2.7B     Pass@1    6.7           #124
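
Pass@k numbers of this kind are typically computed with the unbiased estimator of Chen et al. (2021): draw n samples per problem, count the c samples that pass the unit tests, and estimate the probability that at least one of k samples is correct. Below is a minimal sketch of that estimator; the per-problem counts are made up for illustration and are not the paper's results.

# Minimal sketch of the unbiased pass@k estimator (Chen et al., 2021).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem given n samples, of which c are correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-problem correct counts out of n = 200 samples each.
n = 200
correct_counts = [3, 0, 57, 120]
for k in (1, 10, 100):
    score = sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)
    print(f"pass@{k} = {100 * score:.1f}")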
