The Multimodal Material Segmentation (MCubeS) dataset contains 500 sets of images from 42 street scenes. Each scene has images in four modalities: RGB, angle of linear polarization (AoLP), degree of linear polarization (DoLP), and near-infrared (NIR). The dataset provides annotated ground truth labels for both material and semantic segmentation for every pixel. It is divided into a training set with 302 image sets, a validation set with 96 image sets, and a test set with 102 image sets. Each image is 1224 x 1024 pixels, and each pixel is assigned one of 20 class labels.
10 PAPERS • 1 BENCHMARK
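As a quick sanity check on the MCubeS numbers above, the following minimal sketch recomputes the split proportions and the per-image pixel count. The dictionary keys and variable names are illustrative assumptions, not part of any official MCubeS tooling.

```python
# Split sizes as stated in the MCubeS description (keys are hypothetical names).
splits = {"train": 302, "val": 96, "test": 102}

# The three splits should add up to the 500 image sets in the dataset.
total = sum(splits.values())

# Share of each split, as a fraction of all image sets.
fractions = {name: count / total for name, count in splits.items()}

# Each image is 1224 x 1024 pixels, so this many pixels carry a label.
pixels_per_image = 1224 * 1024

for name, frac in fractions.items():
    print(f"{name}: {splits[name]} image sets ({frac:.1%})")
print(f"labeled pixels per image: {pixels_per_image}")
```

The split is roughly 60/20/20, which is worth keeping in mind when comparing results reported on the validation set against those on the test set.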
MARIDA (Marine Debris Archive) is the first dataset based on multispectral Sentinel-2 (S2) satellite data that distinguishes Marine Debris from the various marine features that co-exist with it, including Sargassum macroalgae, Ships, Natural Organic Material, Waves, Wakes, Foam, dissimilar water types (i.e., Clear, Turbid Water, Sediment-Laden Water, Shallow Water), and Clouds. MARIDA is an open-access dataset that enables the research community to explore the spectral behaviour of certain floating materials, sea state features, and water types, and to develop and evaluate Marine Debris detection solutions based on artificial intelligence and deep learning architectures, as well as satellite pre-processing pipelines. Although it is designed to be beneficial for several machine learning tasks, it primarily aims to benchmark weakly supervised pixel-level semantic segmentation learning methods.
6 PAPERS • 1 BENCHMARK
This dataset contains 235,800 X-ray projections of 131 pieces of modeling clay (Play-Doh) with various numbers of stones inserted. It is intended as an extensive and easy-to-use training dataset for supervised, machine-learning-driven object detection. The ground truth locations of the stones are included.
1 PAPER • NO BENCHMARKS YET
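To put the size of the X-ray projection dataset above in perspective, this small sketch divides the stated totals. It only computes an average; whether every piece actually has the same number of projections is an assumption the description does not confirm.

```python
# Totals as stated in the dataset description.
num_projections = 235_800
num_pieces = 131

# Average number of projections per piece of modeling clay.
# divmod returns (quotient, remainder); a zero remainder means the
# total divides evenly, which is consistent with (but does not prove)
# a fixed number of projections per piece.
per_piece, remainder = divmod(num_projections, num_pieces)

print(f"average projections per piece: {per_piece} (remainder {remainder})")
```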
EBHI-Seg is a dataset containing 5,170 images covering six tumor differentiation stages, together with the corresponding ground truth images. The dataset can help researchers develop new segmentation algorithms for the medical diagnosis of colorectal cancer.
A stack of 2D grayscale images of a glass fiber-reinforced polyamide 66 (GF-PA66) specimen, acquired by 3D X-ray Computed Tomography (XCT).
We present a comprehensive dataset comprising a vast collection of raw mineral samples for the purpose of mineral recognition. The dataset encompasses more than 5,000 distinct mineral species and incorporates subsets for zero-shot and few-shot learning. In addition to the samples themselves, some entries in the dataset are accompanied by supplementary natural language descriptions, size measurements, and segmentation masks. For detailed information on each sample, please refer to the minerals_full.csv file.
Raw-Microscopy:
UV6K is a high-resolution remote sensing urban vehicle segmentation dataset.
1 PAPER • 1 BENCHMARK
MapAI: Precision in Building Segmentation Dataset. The dataset comprises 7500 training images and 1500 validation images from Denmark. The test dataset is split into two tasks: the first task (1368 images) is to segment the buildings using only aerial images, whereas the second task (978 images) allows using both aerial images and lidar data. All data samples have a resolution of 500x500. The aerial images are RGB images, while the lidar data are rasterized. The ground truth masks have two classes: building and background.
0 PAPER • NO BENCHMARKS YET
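The two MapAI test tasks above differ only in which input modalities are allowed. A minimal sketch of how that constraint might be encoded in an experiment configuration is shown below; the class name `MapAITask`, the task names, and the helper `is_allowed` are hypothetical and not part of any official MapAI code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MapAITask:
    """Hypothetical description of one MapAI test task."""
    name: str
    num_test_images: int
    modalities: Tuple[str, ...]  # inputs a model may use for this task


# Image counts and allowed inputs as stated in the dataset description.
TASKS = (
    MapAITask("task1_aerial_only", 1368, ("aerial",)),
    MapAITask("task2_aerial_and_lidar", 978, ("aerial", "lidar")),
)


def is_allowed(task: MapAITask, modality: str) -> bool:
    """Return True if the given input modality may be used for the task."""
    return modality in task.modalities


total_test_images = sum(task.num_test_images for task in TASKS)

for task in TASKS:
    print(task.name, task.num_test_images, "lidar allowed:", is_allowed(task, "lidar"))
print("total test images:", total_test_images)
```

Encoding the allowed modalities next to the task makes it harder to accidentally feed lidar rasters to a model being evaluated on the aerial-only task.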
Standardized Multi-Channel Dataset for Glaucoma (SMDG-19) is a collection and standardization of 19 public datasets, comprising full-fundus glaucoma images, associated image metadata such as optic disc segmentation, optic cup segmentation, and blood vessel segmentation, and any provided per-instance text metadata such as sex and age. This dataset is the largest public repository of fundus images with glaucoma.