TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Named Entity Recognition (NER)	BC2GM	KeBioLM	F1	85.1	# 9
Named Entity Recognition (NER)	BC5CDR-chemical	KeBioLM	F1	93.3	# 11
Named Entity Recognition (NER)	BC5CDR-disease	KeBioLM	F1	86.1	# 6
Relation Extraction	ChemProt	KeBioLM	F1	77.5	# 5
Relation Extraction	DDI	KeBioLM	F1	81.9	# 2
Relation Extraction	GAD	KeBioLM	F1	84.3	# 2
Named Entity Recognition (NER)	JNLPBA	KeBioLM	F1	82.0	# 1
Named Entity Recognition (NER)	NCBI-disease	KeBioLM	F1	89.1	# 9

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/named-entity-recognition-ner-on-jnlpba)](https://paperswithcode.com/sota/named-entity-recognition-ner-on-jnlpba?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/relation-extraction-on-ddi)](https://paperswithcode.com/sota/relation-extraction-on-ddi?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/relation-extraction-on-gad)](https://paperswithcode.com/sota/relation-extraction-on-gad?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/relation-extraction-on-chemprot)](https://paperswithcode.com/sota/relation-extraction-on-chemprot?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/named-entity-recognition-on-bc5cdr-disease)](https://paperswithcode.com/sota/named-entity-recognition-on-bc5cdr-disease?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/named-entity-recognition-on-bc2gm)](https://paperswithcode.com/sota/named-entity-recognition-on-bc2gm?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/named-entity-recognition-ner-on-ncbi-disease)](https://paperswithcode.com/sota/named-entity-recognition-ner-on-ncbi-disease?p=improving-biomedical-pretrained-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/improving-biomedical-pretrained-language/named-entity-recognition-on-bc5cdr-chemical)](https://paperswithcode.com/sota/named-entity-recognition-on-bc5cdr-chemical?p=improving-biomedical-pretrained-language)`

Improving Biomedical Pretrained Language Models with Knowledge

NAACL (BioNLP) 2021 · Zheng Yuan, Yijia Liu, Chuanqi Tan, Songfang Huang, Fei Huang

Pretrained language models have shown success in many natural language processing tasks. Many works explore incorporating knowledge into language models. In the biomedical domain, experts have spent decades building large-scale knowledge bases. For example, the Unified Medical Language System (UMLS) contains millions of entities with their synonyms and defines hundreds of relations among entities. Leveraging this knowledge can benefit a variety of downstream tasks such as named entity recognition and relation extraction. To this end, we propose KeBioLM, a biomedical pretrained language model that explicitly leverages knowledge from the UMLS knowledge bases. Specifically, we extract entities from PubMed abstracts and link them to UMLS. We then train a knowledge-aware language model that first applies a text-only encoding layer to learn entity representations and then applies a text-entity fusion encoding to aggregate them. In addition, we add two training objectives: entity detection and entity linking. Experiments on the named entity recognition and relation extraction tasks from the BLURB benchmark demonstrate the effectiveness of our approach. Further analysis on a collected probing dataset shows that our model is better able to model medical knowledge.
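The two auxiliary objectives named in the abstract, entity detection and entity linking, both need per-token training targets derived from the mentions linked to UMLS. The following is a minimal sketch of how such targets could be built, not the authors' implementation. It assumes BIO tags for detection and a linking target placed on the first token of each mention; the span format, the `build_targets` helper, the `ignore_index` convention, and the placeholder concept identifiers `CUI_A` and `CUI_B` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation format: (start_token, end_token_exclusive, concept_id)
Span = Tuple[int, int, str]


def build_targets(
    num_tokens: int,
    spans: List[Span],
    cui_to_id: Dict[str, int],
    ignore_index: int = -100,
) -> Tuple[List[str], List[int]]:
    """Build per-token targets for entity detection and entity linking.

    Detection targets are BIO tags. Linking targets carry the entity index
    on the first token of each mention and `ignore_index` everywhere else.
    """
    bio = ["O"] * num_tokens
    link = [ignore_index] * num_tokens
    for start, end, cui in spans:
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span out of range: {(start, end)}")
        if any(tag != "O" for tag in bio[start:end]):
            raise ValueError(f"overlapping span: {(start, end)}")
        bio[start] = "B"
        for i in range(start + 1, end):
            bio[i] = "I"
        # A mention whose concept is missing from the vocabulary still counts
        # for detection but is skipped for linking.
        link[start] = cui_to_id.get(cui, ignore_index)
    return bio, link


tokens = ["Aspirin", "reduces", "myocardial", "infarction", "risk"]
spans = [(0, 1, "CUI_A"), (2, 4, "CUI_B")]  # placeholder concept ids
cui_to_id = {"CUI_A": 0, "CUI_B": 1}

bio, link = build_targets(len(tokens), spans, cui_to_id)
for token, tag, target in zip(tokens, bio, link):
    print(f"{token}\t{tag}\t{target}")
```

In this sketch the detection loss would be computed over every token, while the linking loss would skip positions marked with `ignore_index`, so only mention-initial tokens contribute to it.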

Code

GanjinZero/KeBioLM official

Tasks

Entity Linking

Language Modelling

named-entity-recognition

Named Entity Recognition

Named Entity Recognition (NER)

Relation

Relation Extraction

Datasets

BC5CDR NCBI Disease BLUE

DDI JNLPBA ChemProt BC2GM GAD

Results from the Paper

Ranked #1 on Named Entity Recognition (NER) on JNLPBA
