The MPQA Opinion Corpus contains 535 news articles from a wide variety of news sources, manually annotated for opinions and other private states (such as beliefs, emotions, sentiments, and speculations).
301 PAPERS • 3 BENCHMARKS
We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials. Although this was a new task, we had a total of 26 submissions across 3 evaluation scenarios. We expect the task and the findings reported in this paper to be relevant for researchers working on understanding scientific content, as well as the broader knowledge base population and information extraction communities.
33 PAPERS • 2 BENCHMARKS
KPTimes is a large-scale dataset of news texts paired with editor-curated keyphrases.
25 PAPERS • 3 BENCHMARKS
Paper: Improved automatic keyword extraction given more linguistic knowledge. DOI: 10.3115/1119355.1119383
6 PAPERS • 2 BENCHMARKS
We present CSL, a large-scale Chinese Scientific Literature dataset, which contains the titles, abstracts, keywords and academic fields of 396,209 papers. To our knowledge, CSL is the first scientific document dataset in Chinese.
1 PAPER • NO BENCHMARKS YET