PCC (Potsdam Commentary Corpus)

The Potsdam Commentary Corpus (PCC) is a corpus of 220 German newspaper commentaries (2,900 sentences, 44,000 tokens) taken from the online editions of the Märkische Allgemeine Zeitung (MAZ subcorpus) and Tagesspiegel (ProCon subcorpus). It is annotated with several types of linguistic information.

The central subcorpus, which we are making publicly available, consists of 176 MAZ texts annotated with:

  • Sentence Syntax
  • Coreference
  • Discourse Structure (RST & PDTB)
  • Aboutness topics

License


  • Creative Commons Attribution-NonCommercial-ShareAlike
