KLPT – Kurdish Language Processing Toolkit

EMNLP (NLPOSS) 2020  ·  Sina Ahmadi ·

Despite the recent advances in applying language-independent approaches to various natural language processing tasks thanks to artificial intelligence, some language-specific tools are still essential to process a language in a viable manner. Kurdish language is a less-resourced language with a remarkable diversity in dialects and scripts and lacks basic language processing tools. To address this issue, we introduce a language processing toolkit to handle such a diversity in an efficient way. Our toolkit is composed of fundamental components such as text preprocessing, stemming, tokenization, lemmatization and transliteration and is able to get further extended by future developers. The project is publicly available.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here