Robust Summarization Evaluation Benchmark

Introduced by Liu et al. in Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

The Robust Summarization Evaluation (RoSE) benchmark is a large human evaluation dataset consisting of over 22,000 summary-level annotations of summaries produced by state-of-the-art systems on three summarization datasets.
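
A quick way to inspect the annotations is through the Hugging Face `datasets` library. The sketch below is a minimal example under the assumption that the benchmark is mirrored on the Hugging Face Hub as `Salesforce/rose` with a `cnndm_test` configuration; both names should be verified against the official release.

```python
from datasets import load_dataset

# Load the RoSE annotations. The Hub identifier "Salesforce/rose" and the
# configuration name "cnndm_test" are assumptions; check the benchmark's
# official release page for the exact names before running.
rose = load_dataset("Salesforce/rose", "cnndm_test")

# Inspect whatever splits the release provides: each record should pair a
# source document and a system summary with its summary-level human annotation.
for split_name, split in rose.items():
    print(split_name, len(split), split.column_names)
```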


License


  • Unknown
