UTCD (Universal Text Classification Dataset)

Introduced by Clarke et al. in Label Agnostic Pre-training for Zero-shot Text Classification

UTCD is a compilation of 18 classification datasets spanning three categories: Sentiment, Intent/Dialogue, and Topic classification. It focuses on zero-shot text classification, where the candidate labels are descriptive of the text being classified. UTCD consists of approximately 6M training examples and 800K test examples.
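
To make the task framing concrete, here is a minimal sketch of the zero-shot interface: a text and a list of descriptive candidate labels go in, and one label comes out. The function names, the example text, and the labels are all hypothetical and are not taken from UTCD, and the word-overlap scorer is only a toy stand-in for a real model, not the method used by Clarke et al.

```python
import re


def tokenize(text):
    # Lowercase and keep alphabetic runs only; a deliberately crude tokenizer.
    return set(re.findall(r"[a-z]+", text.lower()))


def zero_shot_classify(text, candidate_labels):
    """Return the candidate label that shares the most words with the text.

    Toy scorer (assumption): it only illustrates the input/output shape of
    zero-shot classification with descriptive labels.
    """
    text_tokens = tokenize(text)
    scores = {
        label: len(text_tokens & tokenize(label)) for label in candidate_labels
    }
    # On ties, max() keeps the first label in candidate_labels order.
    return max(candidate_labels, key=lambda label: scores[label])


# Hypothetical example, not drawn from the dataset.
example_text = "Please play some jazz music in the kitchen"
example_labels = ["play music", "book a flight", "check the weather"]
prediction = zero_shot_classify(example_text, example_labels)
print(prediction)
```

Because the labels are plain descriptive phrases, the same function can be called with a label set it has never seen, which is the property the zero-shot setting relies on.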

License


  • Unknown
