TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Code Generation	APPS	CodeChain+WizardCoder-15b	Introductory Pass@1	26.29	# 4
Code Generation	APPS	CodeChain+WizardCoder-15b	Interview Pass@1	7.49	# 3
Code Generation	APPS	CodeChain+WizardCoder-15b	Competition Pass@1	3.75	# 3
Code Generation	APPS	WizardCoder-15b	Introductory Pass@1	26.04	# 5
Code Generation	APPS	WizardCoder-15b	Interview Pass@1	4.21	# 5
Code Generation	APPS	WizardCoder-15b	Competition Pass@1	0.81	# 5
Code Generation	CodeContests	CodeChain + WizardCoder-15B	Test Set pass@1	2.35	# 2
Code Generation	CodeContests	CodeChain + WizardCoder-15B	Test Set pass@5	3.29	# 2
Code Generation	CodeContests	CodeChain + WizardCoder-15B	Val Set pass@1	2.48	# 2
Code Generation	CodeContests	CodeChain + WizardCoder-15B	Val Set pass@5	3.30	# 2
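
The APPS rows above can be compared directly, since both the CodeChain+WizardCoder-15b entry and the plain WizardCoder-15b entry are listed. The short sketch below computes the relative pass@1 gain per difficulty level from those table values only. The helper name `relative_improvement` is an illustrative choice, and these per-row figures are not claimed to correspond to the aggregate 35% and 76% improvements quoted in the abstract, which may be based on other models or averaging.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# (WizardCoder-15b, CodeChain+WizardCoder-15b) pass@1 values from the table above.
apps_pass_at_1 = {
    "Introductory": (26.04, 26.29),
    "Interview": (4.21, 7.49),
    "Competition": (0.81, 3.75),
}

for level, (base, chain) in apps_pass_at_1.items():
    print(f"{level}: {relative_improvement(base, chain):.1f}%")
# Introductory: 1.0%
# Interview: 77.9%
# Competition: 363.0%
```

On these rows the gain is small on introductory problems and much larger on the interview and competition levels, where the baseline scores are low.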

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/codechain-towards-modular-code-generation/code-generation-on-codecontests)](https://paperswithcode.com/sota/code-generation-on-codecontests?p=codechain-towards-modular-code-generation)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/codechain-towards-modular-code-generation/code-generation-on-apps)](https://paperswithcode.com/sota/code-generation-on-apps?p=codechain-towards-modular-code-generation)`

CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules

13 Oct 2023 · Hung Le, Hailin Chen, Amrita Saha, Akash Gokul, Doyen Sahoo, Shafiq Joty

Large Language Models (LLMs) have already become quite proficient at solving simpler programming tasks like those in the HumanEval or MBPP benchmarks. However, solving more complex and competitive programming tasks is still quite challenging for these models, possibly due to their tendency to generate solutions as monolithic code blocks instead of decomposing them into logical sub-tasks and sub-modules. Experienced programmers, by contrast, instinctively write modularized code with abstraction for solving complex tasks, often reusing previously developed modules. To address this gap, we propose CodeChain, a novel framework for inference that elicits modularized code generation through a chain of self-revisions, each guided by representative sub-modules generated in previous iterations. Concretely, CodeChain first instructs the LLM to generate modularized code through chain-of-thought prompting. It then applies a chain of self-revisions by iterating two steps: 1) extracting and clustering the generated sub-modules and selecting the cluster representatives as the more generic and reusable implementations, and 2) augmenting the original chain-of-thought prompt with these selected module implementations and instructing the LLM to regenerate new modularized solutions. We find that by naturally encouraging the LLM to reuse the previously developed and verified sub-modules, CodeChain can significantly boost both the modularity and the correctness of the generated solutions, achieving relative pass@1 improvements of 35% on APPS and 76% on CodeContests. It is shown to be effective on both OpenAI LLMs and open-source LLMs such as WizardCoder. We also conduct comprehensive ablation studies with different prompting methods, numbers of clusters, model sizes, program qualities, etc., to provide useful insights that underpin CodeChain's success.
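
The generate, extract, cluster, and re-prompt loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `llm` argument is a hypothetical callable that maps a prompt string to a program string, the prompt wording is invented, and string similarity with greedy grouping stands in for whatever embedding and clustering method the paper actually uses. The sketch also skips any verification or filtering of programs before their sub-modules are reused.

```python
import ast
from difflib import SequenceMatcher
from typing import Callable, List


def extract_functions(program: str) -> List[str]:
    """Return the source of each top-level function; unparsable code yields []."""
    try:
        tree = ast.parse(program)
    except SyntaxError:
        return []
    return [ast.get_source_segment(program, node)
            for node in tree.body if isinstance(node, ast.FunctionDef)]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in for embedding distance)."""
    return SequenceMatcher(None, a, b).ratio()


def cluster(modules: List[str], threshold: float = 0.6) -> List[List[str]]:
    """Greedy grouping: join the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for module in modules:
        for group in clusters:
            if similarity(module, group[0]) >= threshold:
                group.append(module)
                break
        else:
            clusters.append([module])
    return clusters


def representative(group: List[str]) -> str:
    """Pick the member most similar to all others in its cluster (a medoid)."""
    return max(group, key=lambda m: sum(similarity(m, other) for other in group))


def codechain(task: str, llm: Callable[[str], str],
              n_samples: int = 4, n_rounds: int = 2) -> List[str]:
    """Sample modular solutions, then revise them using representative sub-modules."""
    # Assumed prompt wording; the real chain-of-thought prompt is not given here.
    prompt = f"{task}\nThink step by step and write modular code using small functions."
    programs = [llm(prompt) for _ in range(n_samples)]
    for _ in range(n_rounds):
        modules = [f for p in programs for f in extract_functions(p)]
        if not modules:
            break  # nothing to reuse, keep the current programs
        reps = [representative(group) for group in cluster(modules)]
        revised = (prompt + "\nReuse or adapt these modules where relevant:\n"
                   + "\n\n".join(reps))
        programs = [llm(revised) for _ in range(n_samples)]
    return programs
```

Any text-generation client can be wrapped as `llm`. The similarity threshold, sample count, and number of revision rounds are arbitrary defaults chosen for illustration.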

Code

SalesforceAIResearch/CodeChain official

Tasks

Code Generation

Datasets

APPS CodeContests

Results from the Paper

Ranked #2 on Code Generation on CodeContests (Test Set pass@1 metric)
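
For context on the pass@1 and pass@5 metrics named above: pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k)/C(n, k), where n programs are sampled per problem and c of them pass all tests. The sketch below implements that formula. Whether the entries on this page were computed exactly this way is not stated here, so treat it as background rather than a description of this leaderboard's procedure.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled programs, c: number that pass all tests, k: budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains at least one passing program.
    return 1.0 - comb(n - c, k) / comb(n, k)


print(pass_at_k(n=10, c=1, k=1))  # 0.1
print(pass_at_k(n=10, c=1, k=5))  # 0.5
```

A benchmark-level score is then the mean of this per-problem value over all problems in the split.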
