DialoGLUE is a natural language understanding benchmark for task-oriented dialogue designed to encourage dialogue research in representation-based transfer, domain adaptation, and sample-efficient task learning. It consisting of 7 task-oriented dialogue datasets covering 4 distinct natural language understanding tasks.
16 PAPERS • 2 BENCHMARKS
HJDataset is a large dataset of Historical Japanese Documents with Complex Layouts. It contains over 250,000 layout element annotations of seven types. In addition to bounding boxes and masks of the content regions, it also includes the hierarchical structures and reading orders for layout elements. The dataset is constructed using a combination of human and machine efforts.
5 PAPERS • NO BENCHMARKS YET
A dataset for evaluating text classification, domain adaptation, and active learning models. The dataset consists of 22,660 documents (tweets) collected in 2018 and 2019. It spans across four domains: Alzheimer's, Parkinson's, Cancer, and Diabetes.
2 PAPERS • NO BENCHMARKS YET