KP20k is a large-scale dataset of scholarly articles, with 528K articles for training, 20K for validation, and 20K for testing.
79 PAPERS • 3 BENCHMARKS
KPTimes is a large-scale dataset of news texts paired with editor-curated keyphrases.
25 PAPERS • 3 BENCHMARKS