RepoEval is a benchmark designed for evaluating repository-level code completion systems. While most existing benchmarks focus on single-file tasks, RepoEval addresses the assessment gap for more complex, real-world scenarios in which the relevant context is scattered across multiple files of a repository. It was introduced alongside the RepoCoder framework² and should not be confused with RepoBench¹, a related benchmark covering Python and Java with separate retrieval (RepoBench-R), next-line completion (RepoBench-C), and end-to-end pipeline (RepoBench-P) tasks. Here are the key details about RepoEval:

  1. Tasks (three granularities; scoring is sketched after this list):

    • Line completion: predict the next line of code from the in-file and cross-file context.
    • API invocation completion: complete an invocation of an API defined elsewhere in the repository.
    • Function body completion: generate an entire function body, whose correctness is checked against the repository's unit tests².
  2. Language Supported:

    • RepoEval is constructed from Python repositories².
  3. Purpose:

    • RepoEval is built from recent, high-quality GitHub repositories so that its test cases are unlikely to overlap with model training data, enabling a fair comparison of repository-level completion systems and encouraging continuous improvement².
  4. Availability:

    • RepoEval is publicly available as part of the RepoCoder code release².
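
A minimal sketch of the two string-level metrics typically reported for the line- and API-completion tasks, exact match (EM) and edit similarity (ES). The standard library's SequenceMatcher is an assumption here, standing in for the Levenshtein-based similarity used in practice:

```python
# Hedged sketch of RepoEval-style scoring for line/API completion:
# Exact Match (EM) and Edit Similarity (ES). difflib's ratio() is an
# assumed stand-in for a Levenshtein-based similarity measure.
from difflib import SequenceMatcher


def exact_match(prediction: str, reference: str) -> bool:
    """EM: the completion equals the ground truth after trimming whitespace."""
    return prediction.strip() == reference.strip()


def edit_similarity(prediction: str, reference: str) -> float:
    """ES: normalized similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, prediction.strip(), reference.strip()).ratio()


# Example: average both metrics over a batch of (prediction, reference) pairs.
pairs = [
    ("return os.path.join(root, name)", "return os.path.join(root, name)"),
    ("for item in items:", "for entry in items:"),
]
em = sum(exact_match(p, r) for p, r in pairs) / len(pairs)
es = sum(edit_similarity(p, r) for p, r in pairs) / len(pairs)
print(f"EM: {em:.2f}  ES: {es:.2f}")
```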

In summary, RepoEval provides an evaluation framework for repository-level code completion, letting researchers and developers measure how well a system exploits cross-file context rather than treating each file in isolation.
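
For concreteness, here is a hedged sketch of the kind of system RepoEval is designed to exercise: the iterative retrieve-then-generate loop described in the RepoCoder paper². The `retrieve` and `generate` callables are hypothetical stand-ins, not a real retriever or model API:

```python
# Hedged sketch of an iterative retrieve-then-generate completion loop in the
# style of RepoCoder. `retrieve` and `generate` are hypothetical stand-ins.
from typing import Callable, List


def iterative_completion(
    unfinished_code: str,
    retrieve: Callable[[str], List[str]],        # query -> similar snippets from the repo
    generate: Callable[[str, List[str]], str],   # (code, context) -> predicted completion
    iterations: int = 2,
) -> str:
    query = unfinished_code
    completion = ""
    for _ in range(iterations):
        context = retrieve(query)                        # gather cross-file context
        completion = generate(unfinished_code, context)  # complete with that context
        query = unfinished_code + "\n" + completion      # let the draft refine retrieval
    return completion


# Toy stand-ins so the sketch runs end to end (one round shown).
snippets = ["    return os.path.join(root, name)"]
print(iterative_completion(
    "def join_path(root, name):",
    retrieve=lambda query: snippets,
    generate=lambda code, context: context[0] if context else "pass",
    iterations=1,
))
```

On RepoEval, a loop like this is scored with the EM/ES metrics sketched above for line and API completion, and by executing the repository's unit tests for function bodies².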

(1) RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems. https://arxiv.org/abs/2306.03091. (2) RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation. https://arxiv.org/abs/2303.12570. (3) GitHub - Leolty/repobench. https://github.com/Leolty/repobench.
