GENIE, short for GENeratIve Evaluation, is a system designed to standardize human evaluation across different text generation tasks. It was introduced to produce consistent evaluations that are reproducible over time and across different annotator populations. The system is instantiated with datasets representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension. For each task, GENIE offers a leaderboard that automatically crowdsources annotations for submitted model outputs, evaluating them along axes such as correctness, conciseness, and fluency.