The MPQA Opinion Corpus contains 535 news articles from a wide variety of sources, manually annotated for opinions and other private states (e.g., beliefs, emotions, sentiments, and speculations).
281 PAPERS • 3 BENCHMARKS
KPTimes is a large-scale dataset of news texts paired with editor-curated keyphrases.
20 PAPERS • 3 BENCHMARKS
Paper: Improved automatic keyword extraction given more linguistic knowledge. DOI: 10.3115/1119355.1119383
4 PAPERS • 4 BENCHMARKS
CSL is a large-scale Chinese Scientific Literature dataset containing the titles, abstracts, keywords, and academic fields of 396,209 papers. To the authors' knowledge, CSL is the first scientific document dataset in Chinese.
1 PAPER • NO BENCHMARKS YET