The dataset consists of 400 whole-slide images (WSIs) of lymph node sections stained with hematoxylin and eosin (H&E), collected from two medical centers in the Netherlands. The WSIs are stored in a multi-resolution pyramid format, allowing for efficient retrieval of image subregions at different magnification levels. The training set includes two subsets:
157 PAPERS • 1 BENCHMARK
The UCF-Crime dataset is a large-scale dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies including Abuse, Arrest, Arson, Assault, Road Accident, Burglary, Explosion, Fighting, Robbery, Shooting, Stealing, Shoplifting, and Vandalism. These anomalies are selected because they have a significant impact on public safety.
124 PAPERS • 3 BENCHMARKS
To investigate three temporal localization tasks: supervised and weakly-supervised audio-visual event localization, and cross-modality localization.
89 PAPERS • NO BENCHMARKS YET
The Elephant MIL dataset is a benchmark used in multiple instance learning (MIL), which falls under the broader categories of image classification and content-based image retrieval. The task is to determine if an image contains an elephant. Each image is treated as a "bag," and within each bag, the image is segmented into various regions called "instances," represented by feature vectors that capture visual characteristics like color, texture, and shape. A bag is labeled as positive if at least one instance contains an elephant, and negative if none of the instances do. The dataset includes 200 images (bags) with a total of 1220 segments (instances), averaging ~6.1 segments per image. The challenge is that only some segments in a positive image might actually show an elephant, so the goal is to correctly classify the entire image based on these segments. This dataset is widely used to evaluate MIL algorithms, especially in cases where only parts of the data might contain the relev
46 PAPERS • 1 BENCHMARK
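The bag-labeling rule described for the Elephant dataset can be illustrated with a minimal Python sketch. It assumes instance-level labels are available, which is not the case in the dataset itself (only bag labels are given); the function names and toy bags are hypothetical and serve only to show the rule and the per-bag average quoted above.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[bool]) -> bool:
    """Standard MIL assumption: a bag is positive if at least one
    instance is positive, and negative if none are."""
    return any(instance_labels)


def mean_instances_per_bag(total_instances: int, num_bags: int) -> float:
    """Average number of instances (segments) per bag (image)."""
    return total_instances / num_bags


# Hypothetical toy bags: each list holds per-segment "contains an elephant" flags.
toy_bags = {
    "image_a": [False, True, False],   # one elephant segment -> positive bag
    "image_b": [False, False, False],  # no elephant segment  -> negative bag
}

for name, labels in toy_bags.items():
    print(name, "positive" if bag_label(labels) else "negative")

# Figures quoted in the description: 1220 segments across 200 images.
print(round(mean_instances_per_bag(1220, 200), 1))
```

In practice an MIL model has to infer the bag label from instance feature vectors without ever seeing the per-instance flags used in this sketch.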
DAiSEE is a multi-label video classification dataset comprising 9,068 video snippets captured from 112 users for recognizing the user affective states of boredom, confusion, engagement, and frustration "in the wild". Each affective state is labeled at one of four levels (very low, low, high, and very high). The labels are crowd-annotated and correlated with a gold-standard annotation created by a team of expert psychologists.
17 PAPERS • 1 BENCHMARK
17 PAPERS • 3 BENCHMARKS
The Musk dataset describes a set of 92 molecules, and the objective is to distinguish musks from non-musks. Of these molecules, 47 are judged by human experts to be musks and the remaining 45 are judged to be non-musks. Each molecule is described by 166 features based on its shape.
4 PAPERS • 2 BENCHMARKS
The Musk2 dataset is a set of 102 molecules of which 39 are judged by human experts to be musks and the remaining 63 molecules are judged to be non-musks. Each instance corresponds to a possible configuration of a molecule. The 166 features that describe these molecules depend upon the exact shape, or conformation, of the molecule.
4 PAPERS • 1 BENCHMARK
The SPOT dataset contains 197 reviews originating from the Yelp'13 and IMDB collections, annotated with segment-level polarity labels (positive/neutral/negative). Annotations have been gathered at 2 levels of granularity:
3 PAPERS • NO BENCHMARKS YET
Colorectal Adenoma contains 177 whole slide images (156 contain adenoma) gathered and labelled by pathologists from the Department of Pathology, The Chinese PLA General Hospital.
2 PAPERS • NO BENCHMARKS YET
Wiki-en is an annotated English dataset for domain detection extracted from Wikipedia. It includes texts from 7 different domains: “Business and Commerce” (BUS), “Government and Politics” (GOV), “Physical and Mental Health” (HEA), “Law and Order” (LAW), “Lifestyle” (LIF), “Military” (MIL), and “General Purpose” (GEN).
1 PAPER • NO BENCHMARKS YET
Wiki-zh is an annotated Chinese dataset for domain detection extracted from Wikipedia. It includes texts from 7 different domains: “Business and Commerce” (BUS), “Government and Politics” (GOV), “Physical and Mental Health” (HEA), “Law and Order” (LAW), “Lifestyle” (LIF), “Military” (MIL), and “General Purpose” (GEN). It contains 26,280 documents split into training, validation and test.
This dataset focuses only on the robbery category, presenting a new weakly labelled dataset that contains 486 new real-world robbery surveillance videos acquired from public sources.
0 PAPER • NO BENCHMARKS YET