TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Chemical Indexing	BC7 NLM-Chem	Rule-based	F1-score (strict)	0.4849	# 1
Entity Linking	BC7 NLM-Chem	Sieve-based+SapBERT	F1-score (strict)	0.8275	# 1
Named Entity Recognition (NER)	BC7 NLM-Chem	PubMedBERT+MLP+CRF	F1-score (strict)	0.8731	# 1

Chemical identification and indexing in PubMed full-text articles using deep learning and heuristics

Database: The Journal of Biological Databases and Curation 2022 · Tiago Almeida, Rui Antunes, João F. Silva, João R. Almeida, Sérgio Matos

The identification of chemicals in articles has attracted considerable interest in the biomedical scientific community, given its importance in drug development research. Most previous research has focused on PubMed abstracts, and further investigation using full-text documents is required because these contain additional valuable information that must be explored. The manual expert task of assigning Medical Subject Headings (MeSH) terms to these articles later helps researchers find the most relevant publications for their ongoing work. The BioCreative VII NLM-Chem track fostered the development of systems for chemical identification and indexing in PubMed full-text articles. Chemical identification consisted of identifying the chemical mentions and linking these to unique MeSH identifiers. This manuscript describes the system we submitted to the challenge and the post-challenge improvements we made. We propose a three-stage pipeline that individually performs chemical mention recognition, entity normalization and indexing. For chemical mention recognition, we adopted a deep-learning solution that uses PubMedBERT contextualized embeddings followed by a multilayer perceptron and a conditional random field tagging layer. For normalization, we use sieve-based dictionary filtering followed by a deep-learning similarity search strategy. Finally, for indexing we developed rules for identifying the most relevant MeSH codes for each article. During the challenge, our system obtained the best official results in the normalization and indexing tasks despite its lower performance in the chemical mention recognition task. In a post-contest phase we boosted our results by improving our named entity recognition model with additional techniques. The final system achieved strict F1-scores of 0.8731, 0.8275 and 0.4849 in the chemical mention recognition, normalization and indexing tasks, respectively. The code to reproduce our experiments and run the pipeline is publicly available.
Database URL: https://github.com/bioinformatics-ua/biocreativeVII_track2
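
The normalization and indexing stages described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the lexicon, the `TOY:` identifiers, the thresholds and the function names are all hypothetical, character n-gram cosine similarity stands in for the deep-learning similarity search, and the indexing rule (keep an identifier if it appears in the title or at least twice overall) is an assumed example rule. Mention detection is taken as given, since the tagger requires a trained model.

```python
from collections import Counter
from math import sqrt

# Toy lexicon: names mapped to made-up identifiers (NOT real MeSH codes).
LEXICON = {
    "aspirin": "TOY:0001",
    "acetylsalicylic acid": "TOY:0001",
    "glucose": "TOY:0002",
    "ethanol": "TOY:0003",
}


def char_ngrams(text, n=3):
    """Bag of character n-grams of the padded, lowercased text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def normalize(mention, threshold=0.5):
    """Map a mention to an identifier, or None if nothing is close enough."""
    # Sieve 1: exact dictionary match.
    if mention in LEXICON:
        return LEXICON[mention]
    # Sieve 2: match after lowercasing and collapsing whitespace.
    key = " ".join(mention.lower().split())
    if key in LEXICON:
        return LEXICON[key]
    # Fallback: nearest lexicon name by n-gram cosine similarity
    # (a stand-in for an embedding-based similarity search).
    query = char_ngrams(key)
    best = max(LEXICON, key=lambda name: cosine(query, char_ngrams(name)))
    if cosine(query, char_ngrams(best)) >= threshold:
        return LEXICON[best]
    return None


def index_document(mentions_by_section, min_count=2):
    """Assumed rule: keep identifiers found in the title or seen min_count+ times."""
    counts = Counter()
    in_title = set()
    for section, mentions in mentions_by_section.items():
        for mention in mentions:
            identifier = normalize(mention)
            if identifier is None:
                continue
            counts[identifier] += 1
            if section == "title":
                in_title.add(identifier)
    return sorted(i for i, c in counts.items() if i in in_title or c >= min_count)
```

Ordering the sieves from most to least precise means the similarity fallback is only consulted for mentions the dictionary cannot resolve directly, which is the general idea behind a sieve-based design.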

Code

bioinformatics-ua/biocreativeVII_tr…

Tasks

Chemical Indexing

Entity Linking

named-entity-recognition

Named Entity Recognition

Named Entity Recognition (NER)

Datasets

BC7 NLM-Chem

Results from the Paper

Ranked #1 on Chemical Indexing on BC7 NLM-Chem
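
As a rough illustration of how a strict F1-score of the kind reported for this paper can be computed, the sketch below assumes that "strict" means a prediction counts only when it matches a gold annotation exactly (here: same document and same character offsets). That definition, the tuple layout and the function name are assumptions made for the example, not the official evaluation script.

```python
def strict_f1(gold, predicted):
    """F1 over exact matches between gold and predicted annotation tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Each annotation is (document id, start offset, end offset); values are made up.
gold = [("d1", 0, 7), ("d1", 10, 17), ("d2", 3, 9)]
predicted = [("d1", 0, 7), ("d1", 10, 16), ("d2", 3, 9), ("d2", 20, 25)]
score = strict_f1(gold, predicted)  # the off-by-one span (10, 16) is not a match
```

Under this definition a boundary error costs twice: the wrong span is a false positive and the missed gold span is a false negative.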
