PLOD: An Abbreviation Detection Dataset for Scientific Documents

The detection and extraction of abbreviations from unstructured texts can help improve the performance of Natural Language Processing tasks such as machine translation and information retrieval. However, the publicly available datasets are too small to train deep-neural-network-based models that generalise well. This paper presents PLOD, a large-scale dataset for abbreviation detection and extraction that contains 160k+ segments automatically annotated with abbreviations and their long forms. We manually validated a subset of instances and performed a complete automatic validation of the dataset. We then used it to train several baseline models for detecting abbreviations and long forms. The best models achieved an F1-score of 0.92 for detecting abbreviations and 0.89 for detecting their corresponding long forms. We release this dataset along with our code and all models publicly at https://github.com/surrey-nlp/PLOD-AbbreviationDetection
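The released checkpoints can be used as ordinary token-classification models. The sketch below is a minimal example, assuming the fine-tuned model listed in the benchmark table further down (surrey-nlp/roberta-large-finetuned-abbr) is available on the Hugging Face Hub under that identifier and tags abbreviation (AC) and long-form (LF) spans; the input sentence is purely illustrative.

```python
from transformers import pipeline

# Minimal sketch: load one of the released checkpoints as a token-classification
# pipeline. Assumption: the model from the benchmark table below is published on
# the Hugging Face Hub under the same identifier and uses AC (abbreviation) and
# LF (long form) labels.
abbr_tagger = pipeline(
    "token-classification",
    model="surrey-nlp/roberta-large-finetuned-abbr",
    aggregation_strategy="simple",  # merge word pieces into whole spans
)

# Hypothetical input sentence for illustration.
text = "Light dissolved inorganic carbon (DIC) was measured at all stations."
for span in abbr_tagger(text):
    print(span["entity_group"], span["word"], round(span["score"], 3))
```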


Datasets


Introduced in the Paper:

PLOD-filtered
PLOD-unfiltered

Used in the Paper:

Acronym Identification
Task                    Dataset          Model                                         Metric         Value  Global Rank
Abbreviation Detection  PLOD-filtered    RoBERTa-large                                 F1-Score (AC)  92.0   #1
Abbreviation Detection  PLOD-filtered    RoBERTa-large                                 F1-Score (LF)  89.8   #1
Abbreviation Detection  PLOD-unfiltered  surrey-nlp/roberta-large-finetuned-abbr       F1-Score (AC)  92.2   #2
Abbreviation Detection  PLOD-unfiltered  surrey-nlp/roberta-large-finetuned-abbr       F1-Score (LF)  89.8   #1
Abbreviation Detection  PLOD-unfiltered  surrey-nlp/albert-large-v2-finetuned-abbDet   F1-Score (AC)  90.7   #1
Abbreviation Detection  PLOD-unfiltered  surrey-nlp/albert-large-v2-finetuned-abbDet   F1-Score (LF)  87.2   #2
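The per-label F1 scores above are span-level scores for the AC and LF tags. A common way to compute such scores is with seqeval over BIO-tagged sequences; the sketch below uses hypothetical gold and predicted tag sequences, and the paper's exact evaluation setup may differ.

```python
from seqeval.metrics import classification_report

# Hypothetical gold and predicted BIO tag sequences using the AC (abbreviation)
# and LF (long form) labels from the benchmark table above.
y_true = [["B-LF", "I-LF", "I-LF", "O", "B-AC", "O"]]
y_pred = [["B-LF", "I-LF", "I-LF", "O", "O",    "O"]]  # misses the abbreviation

# seqeval reports span-level precision, recall and F1 per label,
# i.e. separate scores for AC and LF spans.
print(classification_report(y_true, y_pred, digits=3))
```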
