TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Representation Learning	SciDocs	SciBERT	Avg.	59.6	# 5
Representation Learning	SciDocs	Citeomatic	Avg.	76.0	# 3
Representation Learning	SciDocs	SPECTER	Avg.	80.0	# 2
Citation Prediction	SciDocs (Citation Prediction)	SPECTER	MAP	88.3	# 2
Document Classification	SciDocs (MAG)	SPECTER	F1 (micro)	82.0	# 1
Document Classification	SciDocs (MeSH)	SPECTER	F1 (micro)	86.4	# 2

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/document-level-representation-learning-using/document-classification-on-scidocs-mag)](https://paperswithcode.com/sota/document-classification-on-scidocs-mag?p=document-level-representation-learning-using)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/document-level-representation-learning-using/representation-learning-on-scidocs)](https://paperswithcode.com/sota/representation-learning-on-scidocs?p=document-level-representation-learning-using)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/document-level-representation-learning-using/citation-prediction-on-scidocs-citation)](https://paperswithcode.com/sota/citation-prediction-on-scidocs-citation?p=document-level-representation-learning-using)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/document-level-representation-learning-using/document-classification-on-scidocs-mesh)](https://paperswithcode.com/sota/document-classification-on-scidocs-mesh?p=document-level-representation-learning-using)`

SPECTER: Document-level Representation Learning using Citation-informed Transformers

ACL 2020 · Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld

Representation learning is a critical ingredient for natural language processing systems. Recent Transformer language models like BERT learn powerful textual representations, but these models are targeted towards token- and sentence-level training objectives and do not leverage information on inter-document relatedness, which limits their document-level representation power. For applications on scientific documents, such as classification and recommendation, the embeddings power strong performance on end tasks. We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph. Unlike existing pretrained language models, SPECTER can be easily applied to downstream applications without task-specific fine-tuning. Additionally, to encourage further research on document-level models, we introduce SciDocs, a new evaluation benchmark consisting of seven document-level tasks ranging from citation prediction, to document classification and recommendation. We show that SPECTER outperforms a variety of competitive baselines on the benchmark.
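As a rough illustration of using document embeddings without task-specific fine-tuning, the sketch below ranks candidate papers by cosine similarity to a query paper and scores the ranking with average precision, the per-query quantity behind the MAP metric reported for citation prediction. The 2-d vectors, paper ids, and citation labels are invented for illustration and are not taken from SPECTER or SciDocs; the helper names are hypothetical, and this is not the benchmark's evaluation code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidates):
    """Return candidate ids sorted by descending similarity to the query."""
    return sorted(
        candidates,
        key=lambda cid: cosine(query_vec, candidates[cid]),
        reverse=True,
    )


def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query; MAP is the mean over all queries.

    Assumes every relevant id appears somewhere in ranked_ids.
    """
    hits, total = 0, 0.0
    for rank, cid in enumerate(ranked_ids, start=1):
        if cid in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


# Toy vectors standing in for precomputed, frozen document embeddings.
query = [1.0, 0.0]
candidates = {
    "paper_a": [0.9, 0.1],
    "paper_b": [0.0, 1.0],
    "paper_c": [0.7, 0.7],
}
cited = {"paper_a", "paper_b"}  # hypothetical ground-truth citations

ranking = rank_candidates(query, candidates)
ap = average_precision(ranking, cited)
print(ranking, round(ap, 4))
```

The embeddings are never updated here: only a similarity function and a metric sit on top of them, which is the sense in which the abstract describes the representations as usable without fine-tuning.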

Code

allenai/specter official

495

allenai/scidocs official

125

allenai/aspire

sntcristian/and-kge

hle027/IR-Competition

Tasks

Citation Prediction

Document Classification

General Classification

Language Modelling

Representation Learning

Sentence

Datasets

Introduced in the Paper:

SciDocs

Results from the Paper

Ranked #1 on Document Classification on SciDocs (MAG)

Methods

Absolute Position Encodings • Adam • Attention Dropout • BERT • BPE • Dense Connections • Dropout • GELU • Label Smoothing • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Position-Wise Feed-Forward Layer • ReLU • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer • Weight Decay • WordPiece
