Chinese Word Segmentation

48 papers with code • 6 benchmarks • 3 datasets

Chinese word segmentation is the task of splitting Chinese text (i.e. a sequence of Chinese characters) into words (Source: www.nlpprogress.com).
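To make the input and output of the task concrete, here is a minimal sketch using greedy forward maximum matching against a toy dictionary. This is a classic dictionary baseline, not the method of any paper listed on this page; the function name `forward_max_match`, the `max_len` parameter, and the vocabulary `toy_vocab` are hypothetical and chosen only for illustration.

```python
def forward_max_match(text, vocab, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    found in vocab; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in vocab:
                words.append(candidate)
                i += size
                break
    return words


# Toy vocabulary (an assumption for this example, not a real lexicon).
toy_vocab = {"我", "喜欢", "自然", "语言", "自然语言", "处理"}

segmented = forward_max_match("我喜欢自然语言处理", toy_vocab)
print(segmented)  # ['我', '喜欢', '自然语言', '处理']
```

Concatenating the output words always reproduces the input string, so segmentation only decides where the word boundaries fall.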

Most implemented papers

Exploring Segment Representations for Neural Segmentation Models

ExpResults/segrep-for-nn-semicrf 19 Apr 2016

Many natural language processing (NLP) tasks can be generalized into segmentation problems.

Neural Word Segmentation Learning for Chinese

jcyk/CWS ACL 2016

Most previous approaches to Chinese word segmentation formalize this problem as a character-based sequence labeling task, where only contextual information within fixed-size local windows and simple interactions between adjacent tags can be captured.
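The character-based sequence labeling formulation mentioned above is commonly expressed with a B/M/E/S tag set (begin, middle, end of a multi-character word, or single-character word). The sketch below assumes that tag scheme, which the snippet itself does not name, and uses the hypothetical helper names `words_to_tags` and `tags_to_words`. It shows only the encoding and decoding steps, not a trained tagger.

```python
def words_to_tags(words):
    """Convert a list of words into one B/M/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # First character B, last character E, anything between M.
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at this character
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


gold = ["我", "喜欢", "自然语言", "处理"]
tags = words_to_tags(gold)
print("".join(tags))  # SBEBMMEBE
print(tags_to_words("".join(gold), tags) == gold)  # True
```

A sequence labeling model predicts one such tag per character, and the decoding step turns the predicted tags back into words.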

Fast and Accurate Neural Word Segmentation for Chinese

jcyk/greedyCWS ACL 2017

Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation.

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation

chqiwang/convseg IJCNLP 2017

The first is that they rely heavily on manually designed bigram features, i.e. they are not good at capturing n-gram features automatically.

Effective Neural Solution for Multi-Criteria Word Segmentation

hankcs/multi-criteria-cws 7 Dec 2017

We present a simple yet elegant solution to train a single joint model on multi-criteria corpora for Chinese Word Segmentation (CWS).

Dual Long Short-Term Memory Networks for Sub-Character Representation Learning

hankcs/sub-character-cws 23 Dec 2017

To build a concrete study and substantiate the efficiency of our neural architecture, we take Chinese Word Segmentation as a case study.

Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text

adapt-sjtu/AMTTL COLING 2018

Chinese word segmentation (CWS) trained on open-source corpora faces a dramatic performance drop when dealing with domain-specific text, especially for a domain with many special terms and diverse writing styles, such as the biomedical domain.

State-of-the-art Chinese Word Segmentation with Bi-LSTMs

efeatikkan/Chinese_Word_Segmenter EMNLP 2018

A wide variety of neural-network architectures have been proposed for the task of Chinese word segmentation.