DialSummEval is a multi-faceted dataset of human judgments, created to revisit the evaluation of dialogue summarization models. It contains the outputs of 14 models on SAMSum, a dialogue summarization dataset.
The creators of DialSummEval observed that current dialogue summarization models have flaws that frequently used metrics such as ROUGE may not expose well. They therefore re-evaluated 18 categories of metrics along four dimensions: coherence, consistency, fluency, and relevance. They also conducted the first unified human evaluation of a broad range of dialogue summarization models.
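Re-evaluating a metric against human judgments typically means correlating the metric's scores with the human ratings for each dimension. As a minimal sketch of that idea (the scores below are made up for illustration, and the pure-Python Spearman implementation is a stand-in for whatever correlation routine the authors actually used):

```python
def rankdata(values):
    # Assign 1-based ranks, averaging ranks over ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    # Spearman's rho = Pearson correlation computed on the ranks.
    rx, ry = rankdata(x), rankdata(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical example: automatic metric scores vs. human
# "consistency" ratings for five summaries (illustrative values).
metric_scores = [0.42, 0.31, 0.55, 0.28, 0.47]
human_ratings = [3.5, 2.8, 4.2, 2.5, 3.9]
print(round(spearman(metric_scores, human_ratings), 3))  # → 1.0
```

A metric whose scores rank summaries in the same order as human raters gets a rho near 1; a metric that ranks them randomly gets a rho near 0, which is how poorly correlated metrics are exposed.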