Bootstrapping a Romanian Corpus for Medical Named Entity Recognition

RANLP 2017  ·  Maria Mitrofan ·

Named Entity Recognition (NER) is an important component of natural language processing (NLP), with applicability in biomedical domain, enabling knowledge-discovery from medical texts. Due to the fact that for the Romanian language there are only a few linguistic resources specific to the biomedical domain, it was created a sub-corpus specific to this domain. In this paper we present a newly developed Romanian sub-corpus for medical-domain NER, which is a valuable asset for the field of biomedical text processing. We provide a description of the sub-corpus, informative statistics about data-composition and we evaluate an automatic NER tool on the newly created resource.

PDF Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here