AG News (AG’s News Corpus) is a subset of AG’s corpus of news articles, constructed by assembling the title and description fields of articles from the 4 largest classes (“World”, “Sports”, “Business”, “Sci/Tech”) of AG’s Corpus. It contains 30,000 training and 1,900 test samples per class, for a total of 120,000 training and 7,600 test samples.
770 PAPERS • 10 BENCHMARKS
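As a rough illustration of the construction described for AG News, the sketch below joins a title and a description into one text sample and derives the dataset totals from the per-class counts. The function name `build_sample`, the output keys, the space separator, and the label order are assumptions made for illustration; they are not taken from the dataset's own tooling.

```python
# Class names come from the description above; the index order is an assumption.
AG_NEWS_CLASSES = ["World", "Sports", "Business", "Sci/Tech"]

# Per-class split sizes as stated in the description.
TRAIN_PER_CLASS = 30_000
TEST_PER_CLASS = 1_900


def build_sample(title: str, description: str, label: str) -> dict:
    """Join title and description into one text field (hypothetical format)."""
    if label not in AG_NEWS_CLASSES:
        raise ValueError(f"unknown class: {label}")
    text = f"{title.strip()} {description.strip()}"
    return {"text": text, "label": AG_NEWS_CLASSES.index(label)}


# Totals follow directly from the per-class counts and the number of classes.
total_train = TRAIN_PER_CLASS * len(AG_NEWS_CLASSES)
total_test = TEST_PER_CLASS * len(AG_NEWS_CLASSES)
```

With four classes, `total_train` is 120,000 and `total_test` is 7,600.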
The Yahoo! Answers topic classification dataset is constructed from the 10 largest main categories. Each class contains 140,000 training samples and 6,000 test samples, for a total of 1,400,000 training samples and 60,000 test samples. Of all the answers and other meta-information, only the best answer content and the main category information are used. Source: GitHub
121 PAPERS • 2 BENCHMARKS
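The Yahoo! Answers totals and field selection can be sketched as follows. The record keys `best_answer` and `main_category`, and both helper names, are hypothetical; the raw data format is not specified in the description, so this only mirrors the stated idea of keeping two fields and discarding the rest.

```python
# Split sizes as stated in the description.
NUM_CLASSES = 10
TRAIN_PER_CLASS = 140_000
TEST_PER_CLASS = 6_000


def split_totals(num_classes: int, train_per_class: int, test_per_class: int) -> tuple:
    """Return (total_train, total_test) for a class-balanced dataset."""
    return num_classes * train_per_class, num_classes * test_per_class


def keep_used_fields(record: dict) -> dict:
    """Keep only the best answer and main category (hypothetical key names)."""
    return {
        "best_answer": record["best_answer"],
        "main_category": record["main_category"],
    }


yahoo_train, yahoo_test = split_totals(NUM_CLASSES, TRAIN_PER_CLASS, TEST_PER_CLASS)
```

Because every class has the same number of samples, the totals are a simple product, which matches the 1,400,000 and 60,000 figures quoted above.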
The Medical Abstracts dataset contains 14,438 medical abstracts covering 5 classes of patient conditions. Every abstract is annotated, and the dataset is split into training and test sets.
3 PAPERS • 1 BENCHMARK