Coder Reviewer Reranking for Code Generation

Sampling diverse programs from a code language model and reranking with model likelihood is a popular method for code generation, but it is prone to preferring degenerate solutions. Inspired by collaborative programming, we propose Coder-Reviewer reranking. We augment Coder language models from past work, which generate programs given language instructions, with Reviewer models, which evaluate the likelihood of the instruction given the generated programs. We perform an extensive study across six datasets with eight models from three model families. Experimental results show that Coder-Reviewer reranking leads to consistent and significant improvement (up to 17% absolute accuracy gain) over reranking with the Coder model only. When combined with executability filtering, Coder-Reviewer reranking can often outperform the minimum Bayes risk method. Coder-Reviewer reranking is easy to implement by prompting, can generalize to different programming languages, and works well with off-the-shelf hyperparameters.
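The reranking idea in the abstract can be illustrated with a minimal sketch. It assumes that each sampled program already comes with two log-likelihoods: one from the Coder, log p(program | instruction), and one from the Reviewer, log p(instruction | program). Summing the two scores is an assumption made here for illustration, since the abstract does not spell out how they are combined. The `Candidate` class, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A sampled program with hypothetical, precomputed log-likelihoods."""
    program: str
    coder_logprob: float     # log p(program | instruction), from the Coder
    reviewer_logprob: float  # log p(instruction | program), from the Reviewer


def coder_reviewer_score(c: Candidate) -> float:
    # Assumed combination: product of the two likelihoods,
    # i.e. the sum of their log-probabilities.
    return c.coder_logprob + c.reviewer_logprob


def rerank(candidates: List[Candidate]) -> List[Candidate]:
    # Highest combined score first.
    return sorted(candidates, key=coder_reviewer_score, reverse=True)


# Made-up scores: the short, degenerate programs look likely to the Coder,
# but the Reviewer finds the instruction hard to recover from them.
candidates = [
    Candidate("pass", coder_logprob=-0.5, reviewer_logprob=-12.0),
    Candidate("return x", coder_logprob=-1.0, reviewer_logprob=-9.0),
    Candidate("return sorted(x)", coder_logprob=-3.0, reviewer_logprob=-2.0),
]

coder_only_best = max(candidates, key=lambda c: c.coder_logprob)
combined_best = rerank(candidates)[0]
```

In this toy setup, ranking by the Coder score alone selects the degenerate `pass`, while the combined score selects `return sorted(x)`, which is the failure mode the abstract describes for likelihood-only reranking.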

Results from the Paper


Task             Dataset    Model                                   Metric Name  Metric Value  Global Rank
---------------  ---------  --------------------------------------  -----------  ------------  -----------
Code Generation  HumanEval  code-davinci-002 175B (Reviewer)        Pass@1       61.2          # 15
Code Generation  HumanEval  code-davinci-002 175B (Coder-Reviewer)  Pass@1       56.7          # 18
Code Generation  MBPP       InCoder 6.7B + MBR-Exec                 Accuracy     26.7          # 78
Code Generation  MBPP       code-cushman-001 12B + MBR-Exec         Accuracy     48.3          # 52
Code Generation  MBPP       code-davinci-002 175B + MBR-Exec        Accuracy     63            # 28
Code Generation  MBPP       code-davinci-002 175B + Coder-Reviewer  Accuracy     66.4          # 23
Code Generation  MBPP       code-davinci-002 175B + Reviewer        Accuracy     66.9          # 22
Code Generation  MBPP       CodeGen 16B + Coder-Reviewer            Accuracy     46.2          # 59
Code Generation  MBPP       InCoder 6.7B + Coder-Reviewer           Accuracy     26.1          # 79
Code Generation  MBPP       CodeGen 16B + Reviewer                  Accuracy     44.1          # 64
Code Generation  MBPP       InCoder 6.7B + Reviewer                 Accuracy     24.4          # 80
Code Generation  MBPP       CodeGen 16B + MBR-Exec                  Accuracy     47.3          # 55