Development of Word Embeddings for Uzbek Language
In this paper, we describe the process of developing word embeddings for the Cyrillic variant of the Uzbek language. The result of our work is the first publicly available set of word vectors trained with the word2vec, GloVe, and fastText algorithms on a high-quality web crawl corpus developed in-house. The resulting word embeddings can be used in many downstream natural language processing tasks.
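To illustrate how such word vectors are typically consumed downstream, the following is a minimal sketch of a nearest-neighbour lookup by cosine similarity. The three-dimensional vectors in `toy_vectors` are invented for illustration and are not taken from the released embeddings; the helper names `cosine_similarity` and `most_similar` are likewise hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=2):
    """Rank every other word in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors, made up for this sketch (NOT values from the paper).
toy_vectors = {
    "китоб": [0.9, 0.1, 0.0],   # "book"
    "дафтар": [0.8, 0.2, 0.1],  # "notebook"
    "олма": [0.0, 0.1, 0.9],    # "apple"
}

neighbours = most_similar("китоб", toy_vectors)
print(neighbours)
```

With real embeddings, the dictionary would hold one vector of a few hundred dimensions per vocabulary word, and the same ranking logic would apply unchanged.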