Paper tables with annotated results for A Sub-Character Architecture for Korean Language Processing

Paper

A Sub-Character Architecture for Korean Language Processing

We introduce a novel sub-character architecture that exploits a unique compositional structure of the Korean language. Our method decomposes each character into a small set of primitive phonetic units called jamo letters from which character- and word-level representations are induced. The jamo letters divulge syntactic and semantic information that is difficult to access with conventional character-level units. They greatly alleviate the data sparsity problem, reducing the observation space to 1.6% of the original while increasing accuracy in our experiments. We apply our architecture to dependency parsing and achieve dramatic improvement over strong lexical baselines.
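The decomposition step described in the abstract can be illustrated with a minimal sketch. It assumes the standard Unicode arithmetic for precomposed Hangul syllables (each syllable encodes a lead consonant, a vowel, and an optional tail consonant) and is not the authors' implementation. The function names `decompose_syllable` and `decompose_word` are hypothetical, and the paper may treat an empty tail or non-Hangul symbols differently.

```python
# Constants for the Unicode Hangul Syllables block (U+AC00..U+D7A3).
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes the "no tail" case
S_COUNT = L_COUNT * V_COUNT * T_COUNT   # 11172 precomposed syllables


def decompose_syllable(ch):
    """Split one precomposed Hangul syllable into its conjoining jamo.

    Returns a tuple (lead, vowel) or (lead, vowel, tail). Any character
    outside the Hangul Syllables block is returned unchanged as a 1-tuple
    (an assumption of this sketch, not something the abstract specifies).
    """
    index = ord(ch) - S_BASE
    if not 0 <= index < S_COUNT:
        return (ch,)
    lead = chr(L_BASE + index // (V_COUNT * T_COUNT))
    vowel = chr(V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT)
    tail_index = index % T_COUNT
    if tail_index == 0:
        return (lead, vowel)
    return (lead, vowel, chr(T_BASE + tail_index))


def decompose_word(word):
    """Decompose every character of a word, keeping syllable boundaries."""
    return [decompose_syllable(ch) for ch in word]


if __name__ == "__main__":
    for jamo in decompose_word("한국어"):
        print(" ".join(f"U+{ord(j):04X}" for j in jamo))
```

In the Unicode encoding, 19 lead consonants, 21 vowels, and 27 tail consonants combine into 11,172 possible syllables, which gives a sense of how much smaller the jamo inventory is than the character inventory. Keeping one tuple per syllable preserves the character boundaries, so character-level and then word-level representations could be built from the jamo in a later step.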




