Multi-Candidate Word Segmentation using Bi-directional LSTM Neural Networks

7 May 2018 · Theerapat Lapjaturapit, Kobkrit Viriyayudhakom, Thanaruk Theeramunkong

Most existing word segmentation methods output a single segmentation solution. This paper analyzes word segmentation performance when more than one solution is taken into account. For this investigation, a deep neural network with multiple thresholds is applied to generate multiple segmentation candidates. As a test-bed, the well-known bidirectional long short-term memory (BiLSTM) units are used with eleven contexts in a deep neural network. As performance indices, three measures (recall, precision, and F-measure) are plotted against various thresholds for both boundary-level and word-level evaluation. Experiments show that multi-candidate word segmentation can increase recall while maintaining precision.
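
The idea of producing several candidates from several thresholds can be illustrated with a minimal sketch. It assumes the network emits one boundary probability per gap between adjacent characters; that output format, the function names, and the threshold values are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Set, Tuple


def segment(text: str, boundary_probs: Sequence[float], threshold: float) -> List[str]:
    """Split `text` after character i whenever boundary_probs[i] >= threshold.

    Assumption: boundary_probs has one score per gap between characters,
    so its length is len(text) - 1.
    """
    if not text or len(boundary_probs) != len(text) - 1:
        raise ValueError("expected non-empty text and len(text) - 1 probabilities")
    words, start = [], 0
    for i, p in enumerate(boundary_probs):
        if p >= threshold:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words


def candidates(text: str, boundary_probs: Sequence[float],
               thresholds: Sequence[float]) -> List[List[str]]:
    """Return the distinct segmentations produced by each threshold, in order."""
    seen, out = set(), []
    for t in thresholds:
        seg = tuple(segment(text, boundary_probs, t))
        if seg not in seen:
            seen.add(seg)
            out.append(list(seg))
    return out


def boundary_prf(predicted: Set[int], gold: Set[int]) -> Tuple[float, float, float]:
    """Boundary-level precision, recall, and F-measure over boundary positions."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```

Lower thresholds accept more boundaries, so sweeping the threshold yields progressively finer segmentations; how the paper combines or scores the resulting candidates is not specified in the abstract.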
