A Domain and Language Independent Named Entity Classification Approach Based on Profiles and Local Information

This paper presents a Named Entity Classification system based on machine learning. Our methodology uses local entity information and profiles as its feature set, and all features are generated in an unsupervised manner. The system is tested on two different data sets: (i) the DrugSemantics Spanish corpus (overall F1 = 74.92), where the results are in line with the state of the art without employing external domain-specific resources; and (ii) the English CoNLL2003 dataset (overall F1 = 81.40), where the results are lower than those of previous work but are reached without external knowledge or complex linguistic analysis. Finally, using the same configuration for the two corpora, the difference in overall F1 is only 6.48 points (DrugSemantics = 74.92 versus CoNLL2003 = 81.40). This result supports our hypothesis that the approach is language and domain independent and does not require any external knowledge or complex linguistic analysis.

