Cause and Effect in Governmental Reports: Two Data Sets for Causality Detection in Swedish

PoliticalNLP (LREC) 2022 · Luise Dürlich, Sebastian Reimann, Gustav Finnveden, Joakim Nivre, Sara Stymne

Causality detection is the task of extracting information about causal relations from text. It is an important task for different types of document analysis, including political impact assessment. We present two new data sets for causality detection in Swedish. The first data set is annotated with binary relevance judgments, indicating whether a sentence contains causality information or not. In the second data set, sentence pairs are ranked for relevance with respect to a causality query, containing a specific hypothesized cause and/or effect. Both data sets are carefully curated and mainly intended for use as test data. We describe the data sets and their annotation, including detailed annotation guidelines. In addition, we present pilot experiments on cross-lingual zero-shot and few-shot causality detection, using training data from English and German.
