R-BERT-CNN: Drug-target interactions extraction from biomedical literature
In this work, we describe our participation in the DrugProt task of the BioCreative VII challenge. Drug-target interactions (DTIs) are critical for drug discovery and repurposing, and they are often extracted manually from experimental articles. PubMed holds more than 32M biomedical articles, and manually extracting DTIs from such a large knowledge base is challenging. To address this issue, we provide a solution for Track 1, which aims to extract 10 types of interactions between drug and protein entities. We applied an ensemble classifier that combines BioMed-RoBERTa, a state-of-the-art language model, with convolutional neural networks (CNNs) to extract these relations. Despite the class imbalance in the BioCreative VII DrugProt test corpus, our model performs well relative to the average of the other submissions in the challenge, with a micro F1 score of 55.67% (and 63% on the BioCreative VI ChemProt test corpus). The results show the potential of deep learning for extracting various types of DTIs.
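Since the reported score is a micro-averaged F1, a minimal sketch of how such a score can be computed for relation extraction may help. It assumes the common convention that counts are pooled over all interaction classes and that a "no interaction" label is excluded from scoring; the label names, the `NEGATIVE` constant, and the `micro_f1` helper are illustrative assumptions, not the official DrugProt evaluation script.

```python
# Assumed label for candidate pairs with no interaction (hypothetical name).
NEGATIVE = "NONE"


def micro_f1(gold, pred, negative=NEGATIVE):
    """Return (precision, recall, F1), micro-averaged over non-negative classes.

    gold, pred: equal-length sequences of relation labels, one per
    candidate drug-protein pair.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative and p == g:
            tp += 1          # correct interaction type predicted
        else:
            if p != negative:
                fp += 1      # predicted an interaction that is wrong or absent
            if g != negative:
                fn += 1      # missed (or mislabeled) a gold interaction

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy example with illustrative labels (not data from the paper).
gold = ["INHIBITOR", "NONE", "ACTIVATOR", "SUBSTRATE", "NONE", "INHIBITOR"]
pred = ["INHIBITOR", "ACTIVATOR", "NONE", "INHIBITOR", "NONE", "INHIBITOR"]
precision, recall, f1 = micro_f1(gold, pred)
```

Because counts are pooled before averaging, frequent classes dominate the micro score, which is why class imbalance matters when interpreting it.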
Task | Dataset | Model | Metric Name | Metric Value (%) | Global Rank
---|---|---|---|---|---
DrugProt | DrugProt | R-BERT-CNN | F1 (micro) | 55.67 | # 1