TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Code Generation	CodeContests	WizardCoder-15B	Test Set pass@1	1.11	# 3
Code Generation	CodeContests	WizardCoder-15B	Test Set pass@5	3.18	# 3
Code Generation	CodeContests	WizardCoder-15B	Val Set pass@1	1.98	# 3
Code Generation	CodeContests	WizardCoder-15B	Val Set pass@5	3.27	# 3
Code Generation	HumanEval	WizardCoder 15B	Pass@1	57.30	# 35
Code Generation	MBPP	WizardCoder 15B	Accuracy	51.8	# 45
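
The pass@1 and pass@5 figures above are percentages of problems solved within k sampled programs. The page does not say how they were computed; the sketch below assumes the widely used unbiased estimator, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The names `pass_at_k` and `mean_pass_at_k` are illustrative and do not come from the paper or the leaderboard.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass the tests."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing samples, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that a random size-k subset holds only failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k, as a percentage, over (n, c) pairs, one pair per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)
```

For instance, `mean_pass_at_k([(10, 0), (10, 10)], k=1)` averages one unsolved and one fully solved problem. The sample counts here are made up for illustration and are unrelated to the reported scores.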

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/wizardcoder-empowering-code-large-language/code-generation-on-codecontests)](https://paperswithcode.com/sota/code-generation-on-codecontests?p=wizardcoder-empowering-code-large-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/wizardcoder-empowering-code-large-language/code-generation-on-humaneval)](https://paperswithcode.com/sota/code-generation-on-humaneval?p=wizardcoder-empowering-code-large-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/wizardcoder-empowering-code-large-language/code-generation-on-mbpp)](https://paperswithcode.com/sota/code-generation-on-mbpp?p=wizardcoder-empowering-code-large-language)`
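
The three badge snippets share one URL pattern and differ only in the benchmark slug. As a minimal sketch inferred purely from the lines above, and not from any documented API, a hypothetical helper `pwc_badge_markdown` can rebuild them from the paper slug and the benchmark slug:

```python
def pwc_badge_markdown(paper_slug: str, benchmark_slug: str) -> str:
    """Build badge markdown following the pattern seen in the snippets above."""
    # Badge image: a shields.io endpoint that wraps the leaderboard's badge URL.
    badge = (
        "https://img.shields.io/endpoint.svg"
        f"?url=https://paperswithcode.com/badge/{paper_slug}/{benchmark_slug}"
    )
    # Link target: the leaderboard page, with the paper passed as the `p` parameter.
    link = f"https://paperswithcode.com/sota/{benchmark_slug}?p={paper_slug}"
    return f"[![PWC]({badge})]({link})"


if __name__ == "__main__":
    paper = "wizardcoder-empowering-code-large-language"
    for benchmark in (
        "code-generation-on-codecontests",
        "code-generation-on-humaneval",
        "code-generation-on-mbpp",
    ):
        print(pwc_badge_markdown(paper, benchmark))
```

The slugs are copied from the snippets above. Whether the same pattern holds for other papers or benchmarks is an assumption.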

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

14 Jun 2023 · Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

Code Large Language Models (Code LLMs), such as StarCoder, have demonstrated exceptional performance in code-related tasks. However, most existing models are solely pre-trained on extensive raw code data without instruction fine-tuning. In this paper, we introduce WizardCoder, which empowers Code LLMs with complex instruction fine-tuning, by adapting the Evol-Instruct method to the domain of code. Through comprehensive experiments on four prominent code generation benchmarks, namely HumanEval, HumanEval+, MBPP, and DS-1000, we unveil the exceptional capabilities of our model. It surpasses all other open-source Code LLMs by a substantial margin. Moreover, our model even outperforms the largest closed LLMs, Anthropic's Claude and Google's Bard, on HumanEval and HumanEval+. Our code, model weights, and data are public at https://github.com/nlpxucan/WizardLM

Code

nlpxucan/wizardlm (official): 8,915 stars

nickrosh/evol-teacher: 131 stars

Tasks

Code Generation

Datasets

HumanEval, MBPP, CodeContests, DS-1000

Results from the Paper

Ranked #3 on Code Generation on CodeContests (Test Set pass@1 metric)
