SHR++: An Interface for Morpho-syntactic Annotation of Sanskrit Corpora

We propose a web-based annotation framework, SHR++, for morpho-syntactic annotation of corpora in Sanskrit. SHR++ is designed to generate annotations for the word-segmentation, morphological parsing and dependency analysis tasks in Sanskrit. It incorporates analyses and predictions from various tools designed for processing Sanskrit texts, and utilises them to ease the cognitive load on human annotators. Specifically, SHR++ uses the Sanskrit Heritage Reader, a lexicon-driven shallow parser, to enumerate all the phonetically and lexically valid word splits, along with their morphological analyses, for a given string. This lets annotators choose among candidate solutions rather than perform the segmentation themselves. Further, predictions from a word segmentation tool are added as suggestions to aid the annotators in their decision making. Our evaluation shows that enabling this segmentation suggestion component reduces the annotation time by 20.15%. SHR++ can be accessed online at http://vidhyut97.pythonanywhere.com/ and the codebase, for independent deployment of the system elsewhere, is hosted at https://github.com/iamdsc/smart-sanskrit-annotator.
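
To make the "enumerate, then choose" idea concrete, the following is a minimal sketch in Python, not the SHR++ or Sanskrit Heritage Reader implementation. It assumes a toy three-word lexicon and plain string concatenation, so it ignores the phonetic (sandhi) changes at word boundaries that a real Sanskrit segmenter must handle. The function names `enumerate_splits` and `mark_suggestion` are hypothetical and do not come from the SHR++ codebase.

```python
from functools import lru_cache


def enumerate_splits(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`.

    Toy assumption: words are simply concatenated, with no sandhi.
    """
    @lru_cache(maxsize=None)
    def splits_from(start):
        # Reaching the end of the string yields one valid (empty) split.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in lexicon:
                for rest in splits_from(end):
                    results.append([word] + rest)
        return results

    return splits_from(0)


def mark_suggestion(candidates, predicted):
    """Pair each candidate split with a flag saying whether it matches
    the split predicted by an external segmentation tool (hypothetical)."""
    return [(cand, cand == predicted) for cand in candidates]


# Toy lexicon, chosen only to create an ambiguous string.
lexicon = {"deva", "datta", "devadatta"}
candidates = enumerate_splits("devadatta", lexicon)
flagged = mark_suggestion(candidates, ["devadatta"])
for split, suggested in flagged:
    print(split, "<- suggested" if suggested else "")
```

In this sketch the annotator would see both candidate splits, with the one agreeing with the external prediction flagged, and would only need to confirm or override it.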
