Improving Biomedical Named Entity Recognition with Syntactic Information

Biomedical named entity recognition (BioNER) is an important task for understanding biomedical texts, and it is challenging because large-scale labeled training data and domain knowledge are scarce. To address this challenge, in addition to using powerful encoders (e.g., biLSTM and BioBERT), one possible method is to leverage extra knowledge that is easy to obtain. Previous studies have shown that auto-processed syntactic information can be a useful resource for improving model performance, but their approaches are limited to directly concatenating the embeddings of syntactic information with the input word embeddings. Such syntactic information is therefore leveraged in an inflexible way, and inaccurate syntactic information may hurt model performance. In this paper, we propose BioKMNER, a BioNER model that uses key-value memory networks (KVMN) to incorporate auto-processed syntactic information. We evaluate BioKMNER on six English biomedical datasets, where our method with KVMN outperforms the strong baseline from the previous study, BioBERT, on all datasets. Specifically, the F1 scores of our best-performing model are 85.29% on BC2GM, 77.83% on JNLPBA, 94.22% on BC5CDR-chemical, 90.08% on NCBI-disease, 89.24% on LINNAEUS, and 76.33% on Species-800, with state-of-the-art performance on four of them (i.e., BC2GM, BC5CDR-chemical, NCBI-disease, and Species-800). These results demonstrate that auto-processed syntactic information can be a useful resource for BioNER and that our method with KVMN can appropriately leverage such information to improve model performance.
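The abstract contrasts direct concatenation of syntactic embeddings with a key-value memory read, but gives no equations. The following is only a minimal sketch of a generic key-value memory read, not the paper's actual formulation: the function names (`softmax`, `kv_memory_read`, `fuse`), the dot-product scoring, the element-wise sum used for fusion, and the toy vectors are all assumptions made for illustration. The idea it shows is that each piece of syntactic information (a value) is weighted by how well its associated key matches the encoder's hidden vector, so unhelpful entries can receive low weight instead of being concatenated unconditionally.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kv_memory_read(hidden, key_vecs, value_vecs):
    """Weight each value by the match between `hidden` and its key.

    hidden:     encoder output for one token (list of floats)
    key_vecs:   embeddings of context features (assumed, e.g. nearby words)
    value_vecs: embeddings of the syntactic information tied to each key
    Returns the attention weights and the weighted sum of values.
    """
    # Dot-product scoring is an assumption of this sketch.
    scores = [sum(h * k for h, k in zip(hidden, key)) for key in key_vecs]
    weights = softmax(scores)
    dim = len(value_vecs[0])
    read = [
        sum(w * v[d] for w, v in zip(weights, value_vecs))
        for d in range(dim)
    ]
    return weights, read


def fuse(hidden, read):
    """Combine the memory read with the hidden vector.

    Element-wise addition is an illustrative choice, not the paper's.
    """
    return [h + r for h, r in zip(hidden, read)]


# Toy example: two memory slots with hypothetical 2-d embeddings.
hidden = [1.0, 0.0]
key_vecs = [[1.0, 0.0], [0.0, 1.0]]
value_vecs = [[0.5, 0.5], [-0.5, 0.5]]

weights, read = kv_memory_read(hidden, key_vecs, value_vecs)
fused = fuse(hidden, read)
```

In this toy case the first key matches the hidden vector more closely, so its value receives the larger weight. A real model would learn the key and value embeddings and feed the fused vector to a tagging layer.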

Task                           | Dataset         | Model              | Metric Name | Metric Value | Global Rank
Named Entity Recognition (NER) | BC2GM           | BioKMNER + BioBERT | F1          | 85.29        | #6
Named Entity Recognition (NER) | BC5CDR-chemical | BioKMNER + BioBERT | F1          | 94.22        | #8
Named Entity Recognition (NER) | JNLPBA          | BioKMNER + BioBERT | F1          | 77.83        | #10
Named Entity Recognition (NER) | NCBI-disease    | BioKMNER + BioBERT | F1          | 90.08        | #2
Named Entity Recognition (NER) | Species-800     | BioKMNER + BioBERT | F1          | 76.33        | #1

