NUS

The dataset was constructed by first finding suitable publications and then collecting keyphrases from manual annotators. Google SOAP API was used to find documents using variants of the query “keywords general terms filetype:pdf”. Over 250 of these PDF documents were downloaded for further processing. Documents were then manually restricted to scientific conference papers, with a length range of 4-12 pages. The PDF documents were then converted to plain text using the PDF995 software suite (as it handled two-columned text better than other programs tried). At the end of this process, 211 documents in plain text format were selected which were converted successfully without problems. The authors then recruited student volunteers from our department to participate in manual keyphrase assignments. Each volunteer was given three PDF files (with author-assigned keyphrases hidden) to assign keyphrases to.

Homepage

Benchmarks

Add a new result Link an existing benchmark

Trend	Task	Dataset Variant	Best Model	Paper	Code
	Keyphrase Extraction	NUS	PromptRank

Papers

Paper	Code	Results	Date	Stars

Dataset Loaders

Add Remove

No data loaders found. You can submit your data loader here.

Tasks

Keyphrase Extraction

Similar Datasets

SemEval-2017 Task-10

Inspec

Krapivin

Usage

License

Unknown

NUS

Benchmarks

Add a new result Link an existing benchmark

Papers

Dataset Loaders

Add Remove

Tasks

Similar Datasets

SemEval-2017 Task-10

Inspec

Krapivin

Usage

License

Modalities

Languages

NUS

Benchmarks Edit Add a new result Link an existing benchmark

Papers

Dataset Loaders Edit Add Remove

Tasks Edit

Similar Datasets

SemEval-2017 Task-10

Inspec

Krapivin

Usage

License Edit

Modalities Edit

Languages Edit

Benchmarks

Add a new result Link an existing benchmark

Dataset Loaders

Add Remove

Tasks

License

Modalities

Languages