Syntax-Aware Fill-in-the-Middle (SAFIM) is a benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task. SAFIM has three subtasks: Algorithmic Block Completion, Control-Flow Expression Completion, and API Function Call Completion. SAFIM is sourced from code submitted from April 2022 to January 2023 to minimize the impact of data contamination on evaluation results.
The SAFIM benchmark is partially derived from problem descriptions and code solutions from https://codeforces.com. According to the license of CodeForces, you may publish the texts of Codeforces problems in any open sources, but you must preserve a direct link to the site.
Paper | Code | Results | Date | Stars |
---|