Chinese Named Entity Recognition

32 papers with code • 7 benchmarks • 5 datasets

Chinese named entity recognition is a subtask of information extraction that seeks to locate named entities mentioned in unstructured Chinese text and classify them into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages (Source: Adapted from Wikipedia).
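To make "locate and classify" concrete, here is a minimal sketch that assumes a character-level BIO tagging scheme, a common but not universal way to encode Chinese NER output. The function name `decode_bio`, the example sentence, and the tag set are illustrative assumptions, not taken from any of the papers listed here.

```python
def decode_bio(chars, tags):
    """Turn per-character BIO tags into (text, label, start, end) entities.

    `end` is exclusive. Stray I- tags with no matching B- tag are ignored.
    """
    entities = []
    start, label = None, None
    # A trailing "O" acts as a sentinel so the last open entity is flushed.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            entities.append(("".join(chars[start:i]), label, start, i))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return entities


# Hypothetical example: "Li Ming works at Peking University".
chars = list("李明在北京大学工作")
tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "I-ORG", "I-ORG", "O", "O"]
entities = decode_bio(chars, tags)
```

Tagging per character rather than per word sidesteps the need for a word segmenter, which is one reason character-level schemes are popular for Chinese.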



Most implemented papers

ERNIE: Enhanced Representation through Knowledge Integration

PaddlePaddle/PaddleNLP 19 Apr 2019

We present a novel language representation model enhanced by knowledge called ERNIE (Enhanced Representation through kNowledge IntEgration).

ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations

sinovation/ZEN Findings of the Association for Computational Linguistics 2020

Moreover, it is shown that reasonable performance can be obtained when ZEN is trained on a small corpus, which is important for applying pre-training techniques to scenarios with limited data.

A Unified MRC Framework for Named Entity Recognition

ShannonAI/mrc-for-flat-nested-ner ACL 2020

Instead of treating the task of NER as a sequence labeling problem, we propose to formulate it as a machine reading comprehension (MRC) task.
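The reformulation can be sketched as a data transformation: each entity type becomes a natural-language query, and the entity spans of that type become the answers. This is a simplified illustration, not the repository's actual preprocessing; the function `to_mrc_examples`, the query wording, and the example sentence are all assumptions.

```python
def to_mrc_examples(text, entities, queries):
    """Build one (query, context, answer_spans) example per entity type.

    `entities` holds (start, end, label) triples with exclusive `end`.
    Types with no entity in the text yield an empty answer list.
    """
    examples = []
    for label, query in queries.items():
        spans = [(s, e) for (s, e, l) in entities if l == label]
        examples.append(
            {"label": label, "query": query, "context": text, "answer_spans": spans}
        )
    return examples


# Hypothetical queries; the wording is illustrative only.
queries = {
    "PER": "Find all person names in the text.",
    "ORG": "Find all organizations in the text.",
    "LOC": "Find all locations in the text.",
}
text = "李明在北京大学工作"
gold = [(0, 2, "PER"), (3, 7, "ORG")]
mrc_examples = to_mrc_examples(text, gold, queries)
```

Because every entity type is queried separately, spans of different types may overlap, which is how a formulation like this can represent nested entities that a single tag sequence cannot.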

TENER: Adapting Transformer Encoder for Named Entity Recognition

fastnlp/TENER 10 Nov 2019

Bidirectional long short-term memory networks (BiLSTMs) have been widely used as encoders in models solving the named entity recognition (NER) task.

Chinese NER Using Lattice LSTM

jiesutd/LatticeLSTM ACL 2018

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon.
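The "potential words that match a lexicon" can be illustrated with a small matching routine. This sketch covers only the lexicon lookup that produces the lattice's word spans, not the LSTM itself; the function `match_lexicon`, the `max_len` cutoff, and the toy lexicon are assumptions made for illustration.

```python
def match_lexicon(chars, lexicon, max_len=4):
    """Return (start, end, word) for every multi-character lexicon word.

    `end` is exclusive. Overlapping matches are all kept, since the
    lattice is meant to expose every candidate word to the model.
    """
    matches = []
    n = len(chars)
    for start in range(n):
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = "".join(chars[start:end])
            if word in lexicon:
                matches.append((start, end, word))
    return matches


# Toy lexicon for a classic ambiguous sentence:
# "Nanjing Yangtze River Bridge" vs. "Nanjing mayor ...".
sentence = "南京市长江大桥"
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
spans = match_lexicon(list(sentence), toy_lexicon)
```

Note that the competing readings "市长" (mayor) and "长江" (Yangtze River) both appear in the output; resolving between them is left to the model rather than to a segmenter.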

ERNIE 2.0: A Continual Pre-training Framework for Language Understanding

PaddlePaddle/ERNIE 29 Jul 2019

Recently, pre-trained models have achieved state-of-the-art results in various language understanding tasks, which indicates that pre-training on large-scale corpora may play a crucial role in natural language processing.

CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese

CLUEbenchmark/CLUENER2020 13 Jan 2020

In this paper, we introduce the NER dataset from the CLUE organization (CLUENER2020), a well-defined fine-grained dataset for named entity recognition in Chinese.

Glyce: Glyph-vectors for Chinese Character Representations

ShannonAI/glyce NeurIPS 2019

However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found.

Simplify the Usage of Lexicon in Chinese NER

v-mipeng/LexiconAugmentedNER ACL 2020

This method avoids designing a complicated sequence modeling architecture, and for any neural NER model, it requires only subtle adjustment of the character representation layer to introduce the lexicon information.
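One way to attach lexicon information to the character representation layer is to record, for each character, which matched words it begins, sits inside, ends, or forms alone. The sketch below builds such per-character word sets; it is an illustration under that assumption and not the repository's code. The function `bmes_word_sets`, the `max_len` cutoff, and the toy lexicon are hypothetical, and the later step of turning the sets into vectors is omitted.

```python
def bmes_word_sets(chars, lexicon, max_len=4):
    """For each character, collect matched words by the character's role.

    B: the word begins here, M: the character is strictly inside the word,
    E: the word ends here, S: the character is itself a lexicon word.
    """
    n = len(chars)
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in range(n)]
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            word = "".join(chars[start:end])
            if word not in lexicon:
                continue
            if end - start == 1:
                sets[start]["S"].add(word)
            else:
                sets[start]["B"].add(word)
                sets[end - 1]["E"].add(word)
                for mid in range(start + 1, end - 1):
                    sets[mid]["M"].add(word)
    return sets


# Toy lexicon; the entries are illustrative only.
word_sets = bmes_word_sets(list("长江大桥"), {"长江", "大桥", "长江大桥", "桥"})
```

Since the result is just extra features per character, it can in principle be concatenated to the input of any character-based tagger without changing the sequence model.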

Dice Loss for Data-imbalanced NLP Tasks

ShannonAI/dice_loss_for_NLP ACL 2020

Many NLP tasks such as tagging and machine reading comprehension are faced with the severe data imbalance issue: negative examples significantly outnumber positive examples, and the huge number of background examples (or easy-negative examples) overwhelms the training.
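For orientation, here is a minimal sketch of the standard soft Dice loss for binary labels, written in plain Python. It is the textbook form, 1 - (2·Σ p·y + γ) / (Σ p + Σ y + γ), and is not claimed to be the exact variant proposed in the paper; the function name `soft_dice_loss` and the smoothing default are assumptions.

```python
def soft_dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss over predicted probabilities and 0/1 targets.

    Only the overlap and the two sums enter the score, so correctly
    predicted negatives (p = 0, y = 0) contribute nothing to either term.
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


# One positive token among many negatives, as in typical tagging data.
targets = [1] + [0] * 9
perfect = soft_dice_loss([1.0] + [0.0] * 9, targets)
all_negative = soft_dice_loss([0.0] * 10, targets)
```

A model that predicts "negative" everywhere is right on nine of these ten tokens, yet its Dice loss stays well above that of the perfect prediction, which is the behaviour that makes overlap-based losses attractive when negatives dominate.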