3 dataset results for Named Entity Recognition AND English

CoNLL 2003

CoNLL-2003 is a named entity recognition dataset released as part of the CoNLL-2003 shared task on language-independent named entity recognition. The data consists of eight files covering two languages, English and German. For each language there is a training file, a development file, a test file, and a large file of unannotated data.

639 PAPERS • 16 BENCHMARKS

SciERC

The SciERC dataset is a collection of 500 scientific abstracts annotated with scientific entities, their relations, and coreference clusters. The abstracts are taken from the proceedings of 12 AI conferences and workshops in four AI communities, drawn from the Semantic Scholar Corpus. SciERC extends previous datasets for scientific articles, SemEval 2017 Task 10 and SemEval 2018 Task 7, by expanding the entity types, relation types, and relation coverage, and by adding cross-sentence relations using coreference links.

120 PAPERS • 7 BENCHMARKS

NuNER

The dataset used to pre-train NuNER, from the paper "NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data".

1 PAPER • NO BENCHMARKS YET
