Chinese Word Segmentation

48 papers with code • 6 benchmarks • 3 datasets

Chinese word segmentation is the task of splitting Chinese text (i.e. a sequence of Chinese characters) into words (Source: www.nlpprogress.com).
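
To make the task concrete, here is a minimal sketch of forward maximum matching, a classic dictionary-based baseline. It is not taken from any paper listed on this page; the function name `forward_max_match`, the `max_len` parameter, and the toy vocabulary are all assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation using the longest dictionary match."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Toy vocabulary (hypothetical, for illustration only).
TOY_VOCAB = {"我", "喜欢", "自然", "语言", "自然语言", "处理", "自然语言处理"}

segmented = forward_max_match("我喜欢自然语言处理", TOY_VOCAB, max_len=6)
print(" / ".join(segmented))  # 我 / 喜欢 / 自然语言处理
```

Greedy matching of this kind cannot resolve ambiguous boundaries, which is one reason the neural approaches listed below are usually preferred.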

Benchmarks

These leaderboards track progress in Chinese word segmentation.

Dataset	Best Model
MSR	BABERT-LE
PKU	BABERT-LE
CTB6	LATTE (Linguistic units, lattices, PTMs, GNNs)
MSRA	BABERT-LE
CITYU	WMSeg + ZEN
AS	Glyce + BERT
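
The page does not say which metric the leaderboards use. Assuming the word-level precision, recall, and F1 commonly reported for this task, a minimal scoring sketch could look like the following. The helper names `to_spans` and `prf` are hypothetical, and the example sentences are made up.

```python
def to_spans(words):
    """Convert a word list into a set of (start, end) character offsets."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def prf(gold_words, pred_words):
    """Word-level precision, recall and F1 for one sentence."""
    gold, pred = to_spans(gold_words), to_spans(pred_words)
    # A predicted word counts as correct only if both boundaries match.
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


gold = ["我", "喜欢", "自然语言处理"]
pred = ["我", "喜欢", "自然", "语言", "处理"]
p, r, f = prf(gold, pred)
print(f"P={p:.2f} R={r:.2f} F1={f:.2f}")  # P=0.40 R=0.67 F1=0.50
```

Real evaluations aggregate the counts over a whole test set rather than averaging per-sentence scores.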

Most implemented papers

Exploring Segment Representations for Neural Segmentation Models

ExpResults/segrep-for-nn-semicrf • 19 Apr 2016

Many natural language processing (NLP) tasks can be generalized into segmentation problems.

A Segmentation Matrix Method for Chinese Segmentation Ambiguity Analysis

YPench/SMatrix • ROCLING/IJCLCLP 2016

Neural Word Segmentation Learning for Chinese

jcyk/CWS • ACL 2016

Most previous approaches to Chinese word segmentation formalize the problem as a character-based sequence labeling task, in which only contextual information within fixed-size local windows and simple interactions between adjacent tags can be captured.
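
As an illustration of the character-based sequence labeling view, the sketch below converts between a word list and per-character tags. It assumes the widely used B/M/E/S scheme (begin, middle, end, single); the summary above does not name a tag set, and the function names `words_to_tags` and `tags_to_words` are hypothetical.

```python
def words_to_tags(words):
    """Label every character with B (begin), M (middle), E (end) or S (single)."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        # E and S both close the current word.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        # Keep trailing characters if the tag sequence ends mid-word.
        words.append(current)
    return words


words = ["我", "喜欢", "自然语言处理"]
tags = words_to_tags(words)
print("".join(tags))  # SBEBMMMME
print(tags_to_words("".join(words), tags) == words)  # True
```

A tagger trained under this scheme predicts one label per character, and decoding the label sequence yields the segmentation.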

Transition-Based Neural Word Segmentation

SUTDNLP/NNTransitionSegmentor • ACL 2016

Fast and Accurate Neural Word Segmentation for Chinese

jcyk/greedyCWS • ACL 2017

Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation.

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

chqiwang/convseg • IJCNLP 2017

Existing character-based neural models rely heavily on manually designed bigram features, i.e. they are not good at capturing n-gram features automatically.

Effective Neural Solution for Multi-Criteria Word Segmentation

hankcs/multi-criteria-cws • 7 Dec 2017

We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS).

Dual Long Short-Term Memory Networks for Sub-Character Representation Learning

hankcs/sub-character-cws • 23 Dec 2017

To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a research case example.

Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text

adapt-sjtu/AMTTL • COLING 2018

Chinese word segmentation (CWS) models trained on open-source corpora suffer a dramatic performance drop on domain-specific text, especially in domains with many special terms and diverse writing styles, such as the biomedical domain.

State-of-the-art Chinese Word Segmentation with Bi-LSTMs

efeatikkan/Chinese_Word_Segmenter • EMNLP 2018

A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation.
