About

Chinese word segmentation is the task of splitting Chinese text (i.e. a sequence of Chinese characters) into words (Source: www.nlpprogress.com).
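The definition above does not say how a segmentation is represented. A common framing, assumed here rather than taken from the source, labels each character as B (begin), M (middle), E (end) or S (single-character word). The helper names below are hypothetical and the sketch only converts between words and tags.

```python
def words_to_tags(words):
    """Map a list of words to one B/M/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word closes on E or S
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


words = ["我", "喜欢", "自然语言"]
tags = words_to_tags(words)
restored = tags_to_words("".join(words), tags)
```

Under this framing a segmenter only has to predict one tag per character, and the word boundaries follow from the tags.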

Greatest papers with code

PKUSEG: A Toolkit for Multi-Domain Chinese Word Segmentation

27 Jun 2019 lancopku/pkuseg-python

Chinese word segmentation (CWS) is a fundamental step of Chinese natural language processing.

CHINESE WORD SEGMENTATION

N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models

24 Sep 2020 HIT-SCIR/ltp

We introduce N-LTP, an open-source Python Chinese natural language processing toolkit supporting five basic tasks: Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, and semantic dependency parsing.

CHINESE WORD SEGMENTATION, DEPENDENCY PARSING, KNOWLEDGE DISTILLATION, NAMED ENTITY RECOGNITION, PART-OF-SPEECH TAGGING, SEMANTIC DEPENDENCY PARSING

ZEN: Pre-training Chinese Text Encoder Enhanced by N-gram Representations

2 Nov 2019 sinovation/ZEN

Moreover, it is shown that reasonable performance can be obtained when ZEN is trained on a small corpus, which is important for applying pre-training techniques to scenarios with limited data.

CHINESE NAMED ENTITY RECOGNITION, CHINESE WORD SEGMENTATION, DOCUMENT CLASSIFICATION, NATURAL LANGUAGE INFERENCE, PART-OF-SPEECH TAGGING, SENTENCE PAIR MODELING, SENTIMENT ANALYSIS

fastHan: A BERT-based Joint Many-Task Toolkit for Chinese NLP

18 Sep 2020 fastnlp/fastHan

The kernel of fastHan is a joint many-task model based on a pruned BERT, which uses the first 8 layers in BERT.

CHINESE WORD SEGMENTATION, DEPENDENCY PARSING, NAMED ENTITY RECOGNITION, PART-OF-SPEECH TAGGING
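The fastHan excerpt mentions keeping only the first 8 layers of BERT. The toy sketch below shows that kind of layer pruning on a stand-in encoder. It assumes a 12-layer stack (as in BERT-base) and uses hypothetical placeholder layers, so it shows the idea only and is not fastHan's code.

```python
from typing import Callable, List

Layer = Callable[[List[float]], List[float]]


def make_layer(index: int) -> Layer:
    # Hypothetical stand-in for a Transformer layer: adds its index to every value.
    def layer(hidden: List[float]) -> List[float]:
        return [h + index for h in hidden]
    return layer


def prune_encoder(layers: List[Layer], keep: int) -> List[Layer]:
    """Keep only the first `keep` layers of the stack."""
    return layers[:keep]


def run_encoder(layers: List[Layer], hidden: List[float]) -> List[float]:
    """Apply the layers in order, feeding each output to the next layer."""
    for layer in layers:
        hidden = layer(hidden)
    return hidden


full_encoder = [make_layer(i) for i in range(12)]  # assumed 12-layer encoder
pruned_encoder = prune_encoder(full_encoder, keep=8)
output = run_encoder(pruned_encoder, [0.0])
```

Task-specific heads would then read the output of the eighth layer instead of the twelfth, which reduces the encoder's compute per sentence.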

Effective Neural Solution for Multi-Criteria Word Segmentation

7 Dec 2017 hankcs/multi-criteria-cws

We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS).

CHINESE WORD SEGMENTATION
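The excerpt above does not say how one model is told which segmentation criterion to follow. One simple option, assumed here for illustration, is to wrap every sentence in marker tokens naming its source corpus before pooling the corpora. The token format and corpus names below are hypothetical.

```python
def add_criterion_tokens(sentence, criterion):
    """Wrap a sentence's characters in start/end tokens naming the criterion."""
    start, end = f"<{criterion}>", f"</{criterion}>"
    return [start] + list(sentence) + [end]


# Two toy corpora that would segment the same string under different criteria.
corpora = {
    "pku": ["北京大学"],
    "msr": ["北京大学"],
}

# Pool all corpora into one training set; the markers keep the criteria apart.
joint_inputs = [
    add_criterion_tokens(sentence, name)
    for name, sentences in corpora.items()
    for sentence in sentences
]
```

At inference time the same markers would select the criterion the output should follow.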

RethinkCWS: Is Chinese Word Segmentation a Solved Task?

EMNLP 2020 neulab/InterpretEval

The performance of the Chinese Word Segmentation (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks, especially the successful use of large pre-trained models.

CHINESE WORD SEGMENTATION

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

IJCNLP 2017 chqiwang/convseg

The first is that they rely heavily on manually designed bigram features, i.e., they are not good at capturing n-gram features automatically.

CHINESE WORD SEGMENTATION, FEATURE ENGINEERING, WORD EMBEDDINGS
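As a concrete picture of the bigram features mentioned in the excerpt above, the sketch below extracts character n-grams from a sentence. The function name is hypothetical, and this is only the hand-crafted feature extraction the excerpt describes as a limitation, not the paper's model.

```python
def char_ngrams(sentence, n):
    """Return all contiguous character n-grams of the sentence, left to right."""
    return [sentence[i:i + n] for i in range(len(sentence) - n + 1)]


bigrams = char_ngrams("中文分词", 2)
trigrams = char_ngrams("中文分词", 3)
```

A feature-based segmenter would look these strings up in a feature table, whereas a convolutional model can learn comparable local patterns from character embeddings.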

Improving Chinese Word Segmentation with Wordhood Memory Networks

ACL 2020 SVAIGBA/WMSeg

Contextual features always play an important role in Chinese word segmentation (CWS).

CHINESE WORD SEGMENTATION