Building a Kannada POS Tagger Using Machine Learning and Neural Network Models

9 Aug 2018  Â·  Ketan Kumar Todi, Pruthwik Mishra, Dipti Misra Sharma ·

POS Tagging serves as a preliminary task for many NLP applications. Kannada is a relatively poor Indian language with very limited number of quality NLP tools available for use. An accurate and reliable POS Tagger is essential for many NLP tasks like shallow parsing, dependency parsing, sentiment analysis, named entity recognition. We present a statistical POS tagger for Kannada using different machine learning and neural network models. Our Kannada POS tagger outperforms the state-of-the-art Kannada POS tagger by 6%. Our contribution in this paper is three folds - building a generic POS Tagger, comparing the performances of different modeling techniques, exploring the use of character and word embeddings together for Kannada POS Tagging.

PDF Abstract

Datasets


Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Benchmark
POS Kannada Treebank Conditional Random Field Average F1 0.91 # 1

Methods