Domain Adaptation of Thai Word Segmentation Models using Stacked Ensemble
Like many Natural Language Processing tasks, Thai word segmentation is domain-dependent. Researchers have been relying on transfer learning to adapt an existing model to a new domain. However, this approach is inapplicable to cases where we can interact with only input and output layers of the models, also known as {``}black boxes{''}. We propose a filter-and-refine solution based on the stacked-ensemble learning paradigm to address this black-box limitation. We conducted extensive experimental studies comparing our method against state-of-the-art models and transfer learning. Experimental results show that our proposed solution is an effective domain adaptation method and has a similar performance as the transfer learning method.
PDF AbstractCode
Datasets
Results from the Paper
Ranked #1 on
Thai Word Segmentation
on WS160
(using extra training data)