BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks

13 Aug 2019 · Shreyas Sharma, Ron Daniel Jr.

Biomedical Named Entity Recognition (NER) is a challenging problem in biomedical information processing due to the widespread ambiguity of out-of-context terms and extensive lexical variations. Performance on bioNER benchmarks continues to improve due to advances like BERT, GPT, and XLNet...


Results from the Paper


 Ranked #1 on Named Entity Recognition on Species-800 (using extra training data)

TASK                      DATASET       MODEL     METRIC NAME  METRIC VALUE  GLOBAL RANK
Named Entity Recognition  BC5CDR        BioFLAIR  F1           89.42         # 3
Named Entity Recognition  JNLPBA        BioFLAIR  F1           77.03         # 3
Named Entity Recognition  LINNAEUS      BioFLAIR  F1           87.02         # 2
Named Entity Recognition  NCBI-disease  BioFLAIR  F1           88.85         # 3
Named Entity Recognition  Species-800   BioFLAIR  F1           82.44         # 1
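
The listing does not say how the F1 values are computed. NER benchmarks are commonly scored with exact-match, entity-level F1, so the following is a minimal sketch under that assumption and not the paper's evaluation script. The function name `entity_f1` and the toy spans are hypothetical and chosen only for illustration.

```python
def entity_f1(gold, predicted):
    """Exact-match entity-level precision, recall, and F1.

    Each entity is a (start_token, end_token, entity_type) tuple.
    Assumption: a prediction counts only if all three fields match.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data, not drawn from any benchmark.
gold_spans = [(0, 1, "Species"), (5, 6, "Species"), (9, 9, "Species")]
predicted_spans = [(0, 1, "Species"), (5, 5, "Species"), (9, 9, "Species")]

precision, recall, f1 = entity_f1(gold_spans, predicted_spans)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case the second prediction has the wrong end boundary, so it is both a false positive and a missed gold entity, which is why exact-match scoring is sensitive to span boundaries.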

Methods used in the Paper


METHOD                               TYPE
Cosine Annealing                     Learning Rate Schedules
Sigmoid Activation                   Activation Functions
Tanh Activation                      Activation Functions
Discriminative Fine-Tuning           Fine-Tuning
Linear Warmup With Cosine Annealing  Learning Rate Schedules
SentencePiece                        Tokenizers
BPE                                  Subword Segmentation
GPT                                  Transformers
XLNet                                Transformers
Residual Connection                  Skip Connections
Attention Dropout                    Regularization
Linear Warmup With Linear Decay      Learning Rate Schedules
Weight Decay                         Regularization
GELU                                 Activation Functions
Dense Connections                    Feedforward Networks
Adam                                 Stochastic Optimization
WordPiece                            Subword Segmentation
Dropout                              Regularization
LSTM                                 Recurrent Neural Networks
BiLSTM                               Bidirectional Recurrent Neural Networks
Multi-Head Attention                 Attention Modules
Softmax                              Output Functions
ELMo                                 Word Embeddings
Layer Normalization                  Normalization
Scaled Dot-Product Attention         Attention Mechanisms
BERT                                 Language Models