530 papers with code • 5 benchmarks • 22 datasets
Named entity recognition (NER) involves identifying key pieces of information in a text and classifying them into a set of predefined categories, such as people, places, and organizations, alongside standard token-level labels such as part of speech (PoS).
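As a concrete illustration of the task (not tied to any particular paper listed here), the short sketch below tags entity spans with a pre-trained token-classification model via the Hugging Face transformers pipeline; the checkpoint name dslim/bert-base-NER and the example sentence are illustrative placeholders, and any NER-fine-tuned checkpoint could be substituted.

# Minimal NER sketch using the Hugging Face transformers pipeline.
# Assumptions: the dslim/bert-base-NER checkpoint is used purely as an example.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",       # example checkpoint, not a specific paper's model
    aggregation_strategy="simple",     # merge word pieces into whole entity spans
)

text = "Barack Obama was born in Hawaii and worked in Washington."
for ent in ner(text):
    # Each span is assigned one of the model's predefined categories (e.g. PER, LOC, ORG).
    print(f"{ent['word']:<15} {ent['entity_group']:<5} {ent['score']:.2f}")

Running this prints each detected entity span with its predicted category and confidence score, which is exactly the identification-plus-classification behavior the task description above refers to.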
Pre-trained language models have achieved great success in various natural language understanding (NLU) tasks thanks to their capacity to capture deep contextualized information in text by pre-training on large-scale corpora.
Today, when many practitioners run basic NLP over the entire web and on large-volume traffic, faster methods are paramount for saving time and energy costs.
We test the practical impact of this deficiency on real-world NER datasets, OntoNotes 5.0 and WNUT 2017, and observe clear and consistent improvements over the baseline, up to 8.7% on some of the multi-token entity mentions.
In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task.
In this paper, we introduce the NER dataset from the CLUE organization (CLUENER2020), a well-defined, fine-grained dataset for named entity recognition in Chinese.
Recently, with the surge of transformer-based models, language-specific BERT-based models have proven to be very effective at language understanding, provided they are pre-trained on a very large corpus.