A Neural Named Entity Recognition and Multi-Type Normalization Tool for Biomedical Text Mining

The biomedical literature is vast and growing quickly, and accurate text mining techniques could help researchers efficiently extract useful information from it. However, the named entity recognition models used by existing text mining tools such as tmTool and ezTag are not effective enough and cannot accurately discover new entities. Traditional text mining tools also do not handle overlapping entities, which are frequently observed in multi-type named entity recognition results. We propose BERN, a neural biomedical named entity recognition and multi-type normalization tool. BERN uses high-performance BioBERT named entity recognition models that recognize known entities and discover new ones. Probability-based decision rules are developed to identify the types of overlapping entities. Furthermore, various named entity normalization models are integrated into BERN to assign a distinct identifier to each recognized entity. BERN provides a Web service for tagging entities in PubMed articles or raw text. Researchers can use the BERN Web service for text mining tasks such as new named entity discovery, information retrieval, question answering, and relation extraction. The application programming interfaces and demonstrations of BERN are publicly available at https://bern.korea.ac.kr.
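The abstract does not spell out the probability-based decision rules, so the following is only an illustrative sketch of the general idea, not BERN's actual rules. It assumes each type-specific model returns character spans with a confidence score, and it keeps the most probable mention whenever spans overlap. The `Mention` class, its fields, and `resolve_overlaps` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A recognized entity span (hypothetical structure, not BERN's API)."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    type: str    # entity type, e.g. "gene" or "disease"
    prob: float  # assumed model confidence for this span


def overlaps(a: Mention, b: Mention) -> bool:
    """Return True if the two spans share at least one character."""
    return a.start < b.end and b.start < a.end


def resolve_overlaps(mentions: list[Mention]) -> list[Mention]:
    """Greedy rule: visit mentions from most to least probable and keep
    a mention only if it does not overlap one that was already kept."""
    kept: list[Mention] = []
    for m in sorted(mentions, key=lambda m: (-m.prob, m.start, m.end)):
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    # Return the surviving mentions in document order.
    return sorted(kept, key=lambda m: (m.start, m.end))


candidates = [
    Mention(0, 5, "gene", 0.6),
    Mention(3, 9, "disease", 0.9),   # overlaps the gene span, higher score
    Mention(12, 15, "species", 0.5),
]
resolved = resolve_overlaps(candidates)
```

In this example the gene and disease spans overlap, so only the higher-scoring disease span survives, while the non-overlapping species span is kept unchanged. A real system could use richer rules, for example type-specific thresholds, which this sketch does not attempt to model.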

Task                            Dataset       Model  Metric Name  Metric Value  Global Rank
Named Entity Recognition (NER)  BC2GM         BERN   F1           83.4          #12
Named Entity Recognition (NER)  BC4CHEMD      BERN   F1           91.2          #5
Named Entity Recognition (NER)  LINNAEUS      BERN   F1           88.0          #4
Named Entity Recognition (NER)  NCBI-disease  BERN   F1           88.3          #13
