SummEval

Introduced by Fabbri et al. in SummEval: Re-evaluating Summarization Evaluation

The SummEval dataset is a resource developed by the Yale LILY Lab and Salesforce Research for evaluating text summarization models. It was created as part of a project to address shortcomings in summarization evaluation methods.

The dataset includes summaries generated by recent summarization models trained on the CNN/DailyMail dataset, paired with human annotations collected from both crowdsource workers and experts, who rated each summary along four dimensions: coherence, consistency, fluency, and relevance. The source articles used to generate the summaries are not included and must be retrieved separately from the CNN/DailyMail dataset.

The SummEval project also provides a toolkit for summarization evaluation. This toolkit unifies metrics and promotes robust comparison of summarization systems. It contains popular and recent metrics for summarization as well as several machine translation metrics.
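To make the kind of metric the toolkit unifies concrete, here is a minimal sketch of one widely used summarization metric, ROUGE-1 F1 (unigram overlap between a candidate summary and a reference). This is an illustrative re-implementation, not the toolkit's own API; the function name and simple whitespace tokenization are assumptions made for the example.

```python
from collections import Counter

def rouge1_f1(summary: str, reference: str) -> float:
    """Compute ROUGE-1 F1: harmonic mean of unigram precision and recall.

    Tokenization here is naive lowercase whitespace splitting; real
    implementations typically apply stemming and better tokenization.
    """
    cand_counts = Counter(summary.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it
    # appears in either side.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
```

Metrics in the toolkit expose scores of this general shape (per-example and batch-level), which is what allows systems to be compared under a single interface.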

The goal of the SummEval project is to promote a more complete evaluation protocol for text summarization and advance research in developing evaluation metrics that better correlate with human judgments.

License


  • Unknown

Modalities


  • Texts

Languages


  • English