Discourse Coherence in the Wild: A Dataset, Evaluation and Methods

WS 2018  ·  Alice Lai, Joel Tetreault

To date, there has been very little work on assessing discourse coherence methods on real-world data. To address this, we present a new corpus of real-world texts, the Grammarly Corpus of Discourse Coherence (GCDC), as well as the first large-scale evaluation of leading discourse coherence algorithms. We show that neural models, including two that we introduce here (SentAvg and ParSeq), tend to perform best. We analyze these performance differences and discuss patterns we observed in low-coherence texts across four domains.
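For orientation, here is a minimal sketch of what the two neural baselines named in the abstract might look like, assuming a PyTorch implementation. The embedding and hidden sizes, the 3-way coherence labels, and the pooling choices are illustrative assumptions, not the authors' configuration; in particular, the paper's ParSeq also models paragraph structure, which this simplified two-level hierarchy collapses into a single sentence-sequence LSTM.

```python
import torch
import torch.nn as nn

class SentAvg(nn.Module):
    """Sketch: average word embeddings per sentence, then average the
    sentence vectors into a document vector fed to a linear classifier."""
    def __init__(self, vocab_size, emb_dim=100, num_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.clf = nn.Linear(emb_dim, num_classes)

    def forward(self, doc):
        # doc: list of LongTensors, one tensor of word ids per sentence
        sent_vecs = torch.stack([self.emb(s).mean(dim=0) for s in doc])
        return self.clf(sent_vecs.mean(dim=0))

class ParSeq(nn.Module):
    """Sketch: encode each sentence with an LSTM, then run a second LSTM
    over the sentence vectors to model document-level sequence structure
    (a simplification of the paper's hierarchical model)."""
    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, num_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.sent_lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.doc_lstm = nn.LSTM(hid_dim, hid_dim, batch_first=True)
        self.clf = nn.Linear(hid_dim, num_classes)

    def forward(self, doc):
        sent_vecs = []
        for s in doc:
            _, (h, _) = self.sent_lstm(self.emb(s).unsqueeze(0))
            sent_vecs.append(h[-1].squeeze(0))
        _, (h, _) = self.doc_lstm(torch.stack(sent_vecs).unsqueeze(0))
        return self.clf(h[-1].squeeze(0))

# Toy usage: a 2-sentence "document" over a vocabulary of 50 word ids.
doc = [torch.tensor([1, 4, 7]), torch.tensor([2, 3])]
print(SentAvg(50)(doc).shape, ParSeq(50)(doc).shape)  # both: torch.Size([3])
```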


Datasets


Introduced in the paper: GCDC
| Task | Dataset | Model | Metric | Value | Global Rank |
|------|---------|-------|--------|-------|-------------|
| Coherence Evaluation | GCDC + RST - Accuracy | ParSeq | Accuracy | 55.09 | #3 |
| Coherence Evaluation | GCDC + RST - F1 | ParSeq | Average F1 | 46.65 | #2 |
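Both metrics in the table are standard classification scores. The snippet below is a hedged sketch of how they could be computed, assuming 3-way coherence labels and that "Average F1" denotes a macro-average over classes; both the label set and the averaging mode are assumptions, not documented on this page.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 2, 2, 1, 0]   # gold coherence labels (toy data)
y_pred = [0, 1, 1, 2, 1, 0]   # model predictions (toy data)
print(accuracy_score(y_true, y_pred))             # fraction of exact matches
print(f1_score(y_true, y_pred, average="macro"))  # mean of per-class F1 scores
```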

Methods


No methods listed for this paper.