A benchmark dataset of titles and abstracts from 100,000 arXiv scientific papers. The dataset has 10 classes and is balanced, with exactly 10,000 papers per class. The classes are subcategories of computer science, physics, and math.
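The balance claim above (10 classes, 10,000 papers each) can be checked once the labels are loaded. The sketch below is a minimal illustration, not part of the dataset's tooling. It assumes the labels have already been read into a plain Python list. The helper name `check_balance` and the synthetic `demo_labels` are hypothetical stand-ins, since the file format and the actual class names are not specified here.

```python
from collections import Counter


def check_balance(labels, expected_classes=10, expected_per_class=10_000):
    """Return (ok, counts): ok is True only if every class has the expected size.

    `labels` is assumed to be an iterable with one class label per paper.
    """
    counts = Counter(labels)
    ok = (
        len(counts) == expected_classes
        and all(n == expected_per_class for n in counts.values())
    )
    return ok, counts


# Synthetic stand-in for the real labels (hypothetical class names).
demo_labels = [f"class_{i}" for i in range(10) for _ in range(10_000)]

ok, counts = check_balance(demo_labels)
print("balanced:", ok, "| total papers:", sum(counts.values()))
```

To run it on the real data, replace `demo_labels` with the label column of the downloaded file.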
• Direct link: Download
• Citation:
@inproceedings{farhangi2022protoformer,
title={Protoformer: Embedding Prototypes for Transformers},
author={Farhangi, Ashkan and Sui, Ning and Hua, Nan and Bai, Haiyan and Huang, Arthur and Guo, Zhishan},
booktitle={Advances in Knowledge Discovery and Data Mining: 26th Pacific-Asia Conference, PAKDD 2022, Chengdu, China, May 16--19, 2022, Proceedings, Part I},
pages={447--458},
year={2022}
}