AttaCut: A Fast and Accurate Neural Thai Word Segmenter

16 Nov 2019 · Pattarawat Chormai, Ponrawee Prasertsom, Attapol Rutherford

Word segmentation is a fundamental pre-processing step for Thai Natural Language Processing. The current off-the-shelf solutions are not benchmarked consistently, so it is difficult to compare their trade-offs...

Results from the Paper


| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Thai Word Tokenization | BEST-2010 | AttaCut-SC | F1-Score | 0.91 | #1 |
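
The listing above reports an F1-Score without saying how it is computed. As a rough illustration only, the sketch below shows one common word-level definition for segmentation: a predicted word counts as correct when its character span exactly matches a gold word's span. The function names `word_spans` and `word_f1` are hypothetical, and this is an assumption about the metric, not necessarily the evaluation procedure used for the reported number.

```python
def word_spans(tokens):
    """Map a token list to the set of (start, end) character spans it covers."""
    spans, pos = set(), 0
    for tok in tokens:
        spans.add((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def word_f1(gold_tokens, pred_tokens):
    """Word-level F1: a predicted word is correct only if both boundaries match gold."""
    if "".join(gold_tokens) != "".join(pred_tokens):
        raise ValueError("both segmentations must cover the same text")
    gold, pred = word_spans(gold_tokens), word_spans(pred_tokens)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative (made-up) example: the prediction merges the last two gold words.
gold = ["ฉัน", "กิน", "ข้าว"]
pred = ["ฉัน", "กินข้าว"]
print(round(word_f1(gold, pred), 2))  # 0.4
```

Here one of the two predicted words matches a gold span (precision 1/2) and one of the three gold words is recovered (recall 1/3), which gives an F1 of 0.4. A character-level variant would instead score each boundary decision per character and generally yields higher values for the same output.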