SciBERT: A Pretrained Language Model for Scientific Text

Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018) to address the lack of high-quality, large-scale labeled scientific data...

IJCNLP 2019

Results from the Paper


TASK                                                    DATASET         MODEL                 METRIC NAME  METRIC VALUE  GLOBAL RANK
Sentence Classification                                 ACL-ARC         SciBERT               F1           70.98         # 2
Named Entity Recognition                                BC5CDR          SciBERT (Base Vocab)  F1           88.11         # 7
Named Entity Recognition                                BC5CDR          SciBERT (SciVocab)    F1           88.94         # 6
Relation Extraction                                     ChemProt        SciBERT (Finetune)    F1           83.64         # 1
Relation Extraction                                     ChemProt        SciBERT (Base Vocab)  F1           73.7          # 5
Participant Intervention Comparison Outcome Extraction  EBM-NLP         SciBERT (Base Vocab)  F1           70.82         # 2
Participant Intervention Comparison Outcome Extraction  EBM-NLP         SciBERT (SciVocab)    F1           71.18         # 1
Dependency Parsing                                      GENIA - LAS     SciBERT (Base Vocab)  F1           91.26         # 3
Dependency Parsing                                      GENIA - LAS     SciBERT (SciVocab)    F1           91.41         # 2
Dependency Parsing                                      GENIA - UAS     SciBERT (Base Vocab)  F1           92.32         # 3
Dependency Parsing                                      GENIA - UAS     SciBERT (SciVocab)    F1           92.46         # 2
Relation Extraction                                     JNLPBA          SciBERT (SciVocab)    F1           76.09         # 1
Named Entity Recognition                                JNLPBA          SciBERT (Base Vocab)  F1           75.77         # 6
Named Entity Recognition                                NCBI-disease    SciBERT (SciVocab)    F1           86.45         # 9
Named Entity Recognition                                NCBI-disease    SciBERT (Base Vocab)  F1           86.88         # 8
Sentence Classification                                 Paper Field     SciBERT (Base Vocab)  F1           64.02         # 2
Sentence Classification                                 Paper Field     SciBERT (SciVocab)    F1           65.71         # 1
Sentence Classification                                 PubMed 20k RCT  SciBERT (Base Vocab)  F1           86.81         # 2
Sentence Classification                                 SciCite         SciBERT               F1           84.9          # 1
Citation Intent Classification                          SciCite         SciBERT               F1           84.99         # 1
Sentence Classification                                 ScienceCite     SciBERT (SciVocab)    F1           84.99         # 1
Sentence Classification                                 ScienceCite     SciBERT (Base Vocab)  F1           84.43         # 2
Named Entity Recognition                                SciERC          SciBERT (SciVocab)    F1           65.5          # 4
Relation Extraction                                     SciERC          SciBERT (Base Vocab)  F1           74.42         # 2
Relation Extraction                                     SciERC          SciBERT (SciVocab)    F1           74.64         # 1
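
Several datasets in the results above list both a Base Vocab and a SciVocab entry for the same task. As a rough aid to reading those rows, the sketch below computes the F1 difference (SciVocab minus Base Vocab) for each such pair. The scores are copied from the listing; the dictionary layout and the names `scores`, `vocab_deltas`, and `mean_delta` are illustrative assumptions, and JNLPBA is left out because its two entries are listed under different tasks.

```python
# F1 scores copied from the results listing: (task, dataset) -> (Base Vocab, SciVocab).
# Only pairs reported for the same task and dataset are included.
scores = {
    ("Named Entity Recognition", "BC5CDR"): (88.11, 88.94),
    ("PICO Extraction", "EBM-NLP"): (70.82, 71.18),
    ("Dependency Parsing", "GENIA - LAS"): (91.26, 91.41),
    ("Dependency Parsing", "GENIA - UAS"): (92.32, 92.46),
    ("Named Entity Recognition", "NCBI-disease"): (86.88, 86.45),
    ("Sentence Classification", "Paper Field"): (64.02, 65.71),
    ("Sentence Classification", "ScienceCite"): (84.43, 84.99),
    ("Relation Extraction", "SciERC"): (74.42, 74.64),
}


def vocab_deltas(table):
    """Return SciVocab F1 minus Base Vocab F1 per (task, dataset), rounded to 2 places."""
    return {key: round(sci - base, 2) for key, (base, sci) in table.items()}


deltas = vocab_deltas(scores)
mean_delta = round(sum(deltas.values()) / len(deltas), 2)

if __name__ == "__main__":
    for (task, dataset), delta in sorted(deltas.items(), key=lambda kv: kv[1]):
        print(f"{dataset:<14} {task:<26} {delta:+.2f}")
    print(f"mean difference over {len(deltas)} pairs: {mean_delta:+.2f}")
```

In these rows, NCBI-disease is the only pair where the Base Vocab entry scores higher than the SciVocab entry. The differences are small, and the listing gives no variance estimates, so they should be read as descriptive only.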

Methods used in the Paper


METHOD                           TYPE
Residual Connection              Skip Connections
Attention Dropout                Regularization
Linear Warmup With Linear Decay  Learning Rate Schedules
Weight Decay                     Regularization
GELU                             Activation Functions
Dense Connections                Feedforward Networks
Adam                             Stochastic Optimization
WordPiece                        Subword Segmentation
Softmax                          Output Functions
Dropout                          Regularization
Multi-Head Attention             Attention Modules
Layer Normalization              Normalization
Scaled Dot-Product Attention     Attention Mechanisms
BERT                             Language Models
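
One of the listed methods, Linear Warmup With Linear Decay, is easy to state concretely. The sketch below shows the generic form of that schedule: the learning rate rises linearly from zero to a peak over the warmup steps, then falls linearly back to zero by the final step. The function name and all parameter values are hypothetical and chosen for illustration; they are not hyperparameters reported for SciBERT.

```python
def linear_warmup_linear_decay(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at `step` for a linear-warmup, linear-decay schedule.

    Generic sketch of the schedule family; the arguments are illustrative
    assumptions, not values taken from the SciBERT paper.
    """
    if total_steps <= 0 or not 0 <= warmup_steps <= total_steps:
        raise ValueError("need total_steps > 0 and 0 <= warmup_steps <= total_steps")
    step = max(step, 0)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / warmup_steps
    if step >= total_steps:
        # Past the end of training: the schedule has decayed to zero.
        return 0.0
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


if __name__ == "__main__":
    # Hypothetical settings: 100 training steps, 10 warmup steps, peak rate 1.0.
    for s in (0, 5, 10, 55, 100):
        print(s, linear_warmup_linear_decay(s, total_steps=100, warmup_steps=10, peak_lr=1.0))
```

In practice this function would be evaluated once per optimizer step and its value used as the optimizer's learning rate, for example with Adam, which appears in the same list.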