4 papers with code • 2 benchmarks • 0 datasets
Thai word segmentation
We propose ThaiLMCut, a semi-supervised approach to Thai word segmentation that uses a bidirectional character language model (LM) to leverage linguistic knowledge from unlabeled data.
Ranked #3 on Thai Word Segmentation on BEST-2010
As a test bed, well-known bidirectional long short-term memory (BiLSTM) units are used with eleven contexts in a deep neural network.
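
Character-level models such as those described above are often trained by treating segmentation as a per-character labeling problem, where each character is marked as either starting a word or continuing one. The listing does not say which label scheme these papers use, so the following is only a minimal sketch under that assumption. The function names `to_boundary_labels` and `from_boundary_labels`, and the `|` separator, are hypothetical and chosen for illustration.

```python
def to_boundary_labels(segmented, sep="|"):
    """Convert a separator-delimited string into (raw text, labels).

    Assumed label scheme: 1 marks the first character of a word,
    0 marks every other character. Python iterates over code points,
    so Thai vowel and tone marks each receive their own label.
    """
    words = [w for w in segmented.split(sep) if w]
    text = "".join(words)
    labels = []
    for word in words:
        labels.extend([1] + [0] * (len(word) - 1))
    return text, labels


def from_boundary_labels(text, labels):
    """Rebuild the word list from raw text and per-character labels."""
    if len(text) != len(labels):
        raise ValueError("text and labels must have the same length")
    words = []
    for char, label in zip(text, labels):
        if label == 1 or not words:
            words.append(char)   # start a new word
        else:
            words[-1] += char    # extend the current word
    return words


if __name__ == "__main__":
    text, labels = to_boundary_labels("ฉัน|กิน|ข้าว")
    print(text, labels)
    print(from_boundary_labels(text, labels))
```

Under this framing, a model only has to predict one binary label per character, and the predicted labels can be converted back into words with `from_boundary_labels`.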