TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Micro Precision	6.8	# 2
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Micro Recall	8.4	# 2
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Micro F1	7.5	# 2
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Macro Precision	9.5	# 2
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Macro Recall	8.6	# 2
Scientific Results Extraction	NLP-TDMS (Exp, arXiv only)	TDMS-IE	Macro F1	8.8	# 2
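
As a quick sanity check on the rows above, the sketch below recomputes the micro F1 from the reported micro precision and recall. It assumes micro F1 is the harmonic mean of those two values, which is the standard definition but is not stated on this page. The function name `f1_from_precision_recall` is hypothetical.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro-averaged values reported in the table above (percentages).
micro_precision = 6.8
micro_recall = 8.4

# Assumption: micro F1 is the harmonic mean of micro precision and recall.
micro_f1 = f1_from_precision_recall(micro_precision, micro_recall)
print(round(micro_f1, 1))
```

The result rounds to 7.5, which matches the reported Micro F1. The same check is not expected to reproduce the Macro F1 row: if macro F1 is computed as the mean of per-label F1 scores, as is common, it need not equal the harmonic mean of macro precision and macro recall.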

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/identification-of-tasks-datasets-evaluation/scientific-results-extraction-on-nlp-tdms-exp)](https://paperswithcode.com/sota/scientific-results-extraction-on-nlp-tdms-exp?p=identification-of-tasks-datasets-evaluation)`

Identification of Tasks, Datasets, Evaluation Metrics, and Numeric Scores for Scientific Leaderboards Construction

ACL 2019 · Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, Debasis Ganguly

While the fast-paced inception of novel tasks and new datasets helps foster active research in a community towards interesting directions, keeping track of the abundance of research activity in different areas on different datasets is likely to become increasingly difficult. The community could greatly benefit from an automatic system able to summarize scientific results, e.g., in the form of a leaderboard. In this paper we build two datasets and develop a framework (TDMS-IE) aimed at automatically extracting task, dataset, metric and score from NLP papers, towards the automatic construction of leaderboards. Experiments show that our model outperforms several baselines by a large margin. Our model is a first step towards automatic leaderboard construction, e.g., in the NLP domain.

Code

IBM/science-result-extractor official

Tasks

Scientific Results Extraction

Results from the Paper

Ranked #2 on Scientific Results Extraction on NLP-TDMS (Exp, arXiv only)

