…The dataset contains 350 real images and 426 segmented foregrounds, in which each real image has one or two segmented foregrounds. Each foreground is associated with 10 synthetic composite images.
2 PAPERS • NO BENCHMARKS YET
…Traits are based on the interpersonal circle proposed by Kiesler, where human relations are divided into 16 segments. Each segment has its opposite side in the circle, such as 'friendly and hostile'.
3 PAPERS • NO BENCHMARKS YET
TAU Urban Acoustic Scenes 2019 development dataset consists of 10-second audio segments from 10 acoustic scenes: airport, indoor shopping mall, metro station, pedestrian street, public square, street … Each acoustic scene has 1440 segments (240 minutes of audio). The dataset contains 40 hours of audio in total.
13 PAPERS • 2 BENCHMARKS
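The duration figures quoted in the TAU Urban Acoustic Scenes 2019 entry above follow directly from the segment counts. A quick arithmetic check, using only the numbers stated in the description (the variable names are illustrative and not part of the dataset):

```python
# Figures quoted in the dataset description above.
SEGMENT_SECONDS = 10        # each audio segment is 10 seconds long
SEGMENTS_PER_SCENE = 1440   # segments per acoustic scene
NUM_SCENES = 10             # number of acoustic scenes

# 1440 segments x 10 s = 14,400 s of audio per scene.
minutes_per_scene = SEGMENT_SECONDS * SEGMENTS_PER_SCENE / 60
# 10 scenes x 240 min = 2,400 min of audio overall.
total_hours = minutes_per_scene * NUM_SCENES / 60

print(minutes_per_scene)  # 240.0
print(total_hours)        # 40.0
```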
…There are two common metrics: Detection AUROC and Segmentation (or pixelwise) AUROC. Detection (or classification) methods output a single float (anomaly score) per input test image. Segmentation methods output an anomaly probability for each pixel. "To assess segmentation performance, we evaluate the relative per-region overlap of the segmentation with the ground truth. We define the true positive rate as the percentage of pixels that were correctly classified as anomalous" [1]. The segmentation metric was later improved to balance regions of small and large area; see PRO-AUC.
287 PAPERS • 4 BENCHMARKS
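The difference between the two metrics in the entry above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the benchmark's official evaluation code: the helper names (`auroc`, `detection_auroc`, `segmentation_auroc`) and the toy data are hypothetical, AUROC is computed with the standard rank-sum formulation, and the region-balanced PRO-AUC variant is not covered.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the rank-sum (Mann-Whitney U) formulation; ties share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the group of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def detection_auroc(image_labels, image_scores):
    """Detection: one anomaly score (a single float) per test image."""
    return auroc(image_labels, image_scores)


def segmentation_auroc(gt_masks, score_maps):
    """Segmentation: one anomaly score per pixel, pooled over all test images."""
    flat_labels = [p for mask in gt_masks for row in mask for p in row]
    flat_scores = [s for smap in score_maps for row in smap for s in row]
    return auroc(flat_labels, flat_scores)


# Toy data (hypothetical): four test images, two of them anomalous.
det = detection_auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Toy data (hypothetical): two 2x2 images with binary ground-truth masks.
masks = [[[0, 1], [0, 0]], [[0, 0], [1, 1]]]
maps = [[[0.1, 0.9], [0.2, 0.3]], [[0.2, 0.1], [0.7, 0.6]]]
seg = segmentation_auroc(masks, maps)

print(det)  # 0.75
print(seg)  # 1.0
```

The only difference between the two functions is the unit being scored: whole images for detection, individual pixels for segmentation. Because the pixelwise version pools all pixels, large anomalous regions dominate the score, which is the imbalance the entry says PRO-AUC was introduced to address.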
SinGAN-Seg-polyps is a synthetic dataset for polyp segmentation consisting of 10,000 synthetic polyps and masks.
1 PAPER • NO BENCHMARKS YET
…It can be applied in multiple tasks, such as object detection, instance segmentation, semantic segmentation, free-space segmentation, and waterline segmentation.
8 PAPERS • 2 BENCHMARKS
The Digital Retinal Images for Vessel Extraction (DRIVE) dataset is a dataset for retinal vessel segmentation. In the training set, each image has one manual segmentation by an ophthalmological expert. In the test set, each image has two manual segmentations by two different observers; the first observer's segmentation is accepted as the ground truth for performance evaluation.
274 PAPERS • 2 BENCHMARKS
…The dataset was annotated with glaucoma grade in every sample, and macular fovea coordinates as well as optic disc/cup segmentation mask in the fundus image. We invite the medical image analysis community to participate by developing and testing existing and novel automated classification and segmentation methods. The GAMMA challenge consists of three tasks: (1) grading glaucoma using multi-modality data; (2) segmentation of the optic disc and cup in fundus images; (3) localization of the macular fovea in fundus images.
Each episode directory contains word-level and segment-level information of the whole episode and also parallel samples extracted under segments_eng and segments_spa subdirectories.
…This 1.6TB dataset consists of raw-data measurements of ~25,000 slices (155 patients) of anonymized patient knee MRI scans, the corresponding scanner-generated DICOM images, manual segmentations of four … Challenge Tracks. DICOM Track: The DICOM benchmarking track uses scanner-generated DICOM images as the input for image segmentation and detection tasks. Raw Data Track: The Raw Data benchmarking track uses raw MRI data (i.e., k-space) as the input for image reconstruction, segmentation and detection tasks.
6 PAPERS • NO BENCHMARKS YET
One-Shot Affordance Part Segmentation variant of the UMD dataset. Each object instance in the dataset contains a single image.
1 PAPER • 14 BENCHMARKS
MJU-Waste is an RGBD waste object segmentation dataset that is made public to facilitate future research in this area.
3 PAPERS • 1 BENCHMARK
A three million frame, multi-view, furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose.
16 PAPERS • NO BENCHMARKS YET
The CLOUD dataset is a set of Anterior Segment Optical Coherence Tomography (AS-OCT) images used for the automatic identification and representation of the cornea-contact lens relationship. In particular, the images were obtained by an OCT Cirrus 500 scanner from Carl Zeiss Meditec with an anterior segment module, from users of scleral contact lenses (SCL).
The EXPO-HD Dataset is a dataset of Expo whiteboard markers for the purpose of instance segmentation. The dataset contains two subsets (both include instance segmentation labels): Photorealistic synthetic image dataset with 5000 images.
PartImageNet is a large, high-quality dataset with part segmentation annotations. It consists of 158 classes from ImageNet with approximately 24000 images. It can be utilized in multiple vision tasks including but not limited to: Part Discovery, Semantic Segmentation, Few-shot Learning.
15 PAPERS • NO BENCHMARKS YET
LVIS is a dataset for long tail instance segmentation. It has annotations for over 1000 object categories in 164k images.
434 PAPERS • 14 BENCHMARKS
BigDatasetGAN is a dataset for pixel-wise ImageNet segmentation. It consists of large synthetic datasets from BigGAN & VQGAN.
9 PAPERS • NO BENCHMARKS YET
…The A2D dataset serves as a large-scale testbed for various vision problems: video-level single- and multiple-label actor-action recognition, instance-level object segmentation/co-segmentation, and pixel-level actor-action semantic segmentation, to name a few.
40 PAPERS • 1 BENCHMARK
…The segments are of varying length, between 3 and 10 seconds long, and in each clip the only visible face in the video and audible sound in the soundtrack belong to a single speaking person. In total, the dataset contains roughly 4700 hours of video segments with approximately 150,000 distinct speakers, spanning a wide variety of people, languages and face poses.
35 PAPERS • NO BENCHMARKS YET
…The dataset consists of the same songs split into 3,223 acoustically homogeneous segments of 3 to 16 seconds. The tag labels are annotated at the segment level rather than the track level.
…microaneurysm, intraretinal hemorrhage, hard exudate, cotton-wool spot, vitreous hemorrhage, preretinal hemorrhage, neovascularization and fibrous proliferation; over 34K expert-labeled pixel-level lesion segments; multi-task, i.e., lesion segmentation, lesion classification, and DR grading.
…It is used for 3D axon instance segmentation of brain cortical regions. The authors proofread over 18,000 axon instances to provide dense 3D axon instance segmentation, enabling large-scale evaluation of axon reconstruction methods.
SegTrack v2 is a video segmentation dataset with full pixel-level annotations on multiple objects at each frame within each video.
102 PAPERS • 4 BENCHMARKS
REFUGE Challenge provides a data set of 1200 fundus images with ground truth segmentations and clinical glaucoma labels, currently the largest existing one.
13 PAPERS • 5 BENCHMARKS
gRefCOCO is the first large-scale Generalized Referring Expression Segmentation dataset that contains multi-target, no-target, and single-target expressions.
20 PAPERS • 2 BENCHMARKS
SA-1B consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B high-quality segmentation masks.
89 PAPERS • NO BENCHMARKS YET
The SPOT dataset contains 197 reviews originating from the Yelp'13 and IMDB collections ([1][2]), annotated with segment-level polarity labels (positive/neutral/negative). … produced by a state-of-the-art RST parser. This dataset is intended to aid sentiment analysis research and, in particular, the evaluation of methods that attempt to predict sentiment on a fine-grained, segment-level …
The LUNA16 (LUng Nodule Analysis) dataset is a dataset for lung segmentation. It consists of 1,186 lung nodules annotated in 888 CT scans.
83 PAPERS • 1 BENCHMARK
…It is used for semantic segmentation.
24 PAPERS • 1 BENCHMARK
DAVIS17 is a dataset for video object segmentation. It contains a total of 150 videos: 60 for training, 30 for validation, and 60 for testing.
270 PAPERS • 11 BENCHMARKS
The Beijing Traffic Dataset collects traffic speeds at 5-minute granularity for 3126 roadway segments in Beijing between 2022/05/12 and 2022/07/25.
1 PAPER • 1 BENCHMARK
BiasCorp is a dataset for racism detection containing 139,090 comments and news segments from three specific sources: Fox News, BreitbartNews, and YouTube.
Northumberland Dolphin Dataset 2020 (NDD20) is a challenging image dataset annotated for both coarse and fine-grained instance segmentation and categorisation. NDD20 contains a large collection of above and below water images of two different dolphin species for traditional coarse and fine-grained segmentation.
4 PAPERS • NO BENCHMARKS YET
To perform universal event stream segmentation, we collected a large-scale RGB-Event dataset for event-centric segmentation from currently available pixel-level aligned datasets (VisEvent, COESOT), namely …
8 PAPERS • 1 BENCHMARK
The semantic segmentation of clothes is a challenging task due to the wide variety of clothing styles, layers and shapes. To ensure the high quality of the dataset, all images were manually annotated at the pixel level using JS Segment Annotator, a free web-based image annotation tool.
SemanticUSL is a dataset for domain adaptation for LiDAR point cloud semantic segmentation. The dataset has the same data format and ontology as SemanticKITTI.
Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the models' capabilities is hampered by the lack of a … To mitigate the limitations, we present AMOS, a large-scale, diverse, clinical dataset for abdominal organ segmentation. … multi-center, multi-vendor, multi-modality, multi-phase, multi-disease patients, each with voxel-level annotations of 15 abdominal organs, providing challenging examples and a test-bed for studying robust segmentation … We further benchmark several state-of-the-art medical segmentation models to evaluate the status of the existing methods on this new challenging dataset.
51 PAPERS • 1 BENCHMARK
…Usage: 2D/3D image segmentation. Format: HDF5. Libraries to read HDF5 files: 1) silx: https://github.com/silx-kit/silx; 2) h5py: https://www.h5py.org; 3) pymicro: https://github.com/heprom/pymicro. Trained models to segment this dataset: https://doi.org/10.5281/zenodo.4601560. Please cite us as: @ARTICLE{10.3389/fmats.2021.761229, AUTHOR={Bertoldo, João P. C. and Decencière, Etienne and Ryckelynck, David and Proudhon, Henry}, TITLE={A Modular U-Net for Automated Segmentation of X-Ray Tomography Images in Composite Materials}, JOURNAL={Frontiers in
…In FUNSD and CORD, segment layout annotations are aligned with labeled entities, so they do not reflect the reading-order issue of NER on scanned VrDs and are thus unsuitable for evaluating current … Their segment layout annotations are aligned with real-world situations and entity mentions are labeled on words. The proposed FUNSD-r consists of 199 document samples including the image, layout annotation of segments and words, and labeled entities of 3 categories.
…In FUNSD and CORD, segment layout annotations are aligned with labeled entities, so they do not reflect the reading-order issue of NER on scanned VrDs and are thus unsuitable for evaluating current … Their segment layout annotations are aligned with real-world situations and entity mentions are labeled on words. The proposed CORD-r consists of 999 document samples including the image, layout annotation of segments and words, and labeled entities of 30 categories.
Kvasir-SEG is an open-access dataset of gastrointestinal polyp images and corresponding segmentation masks, manually annotated by a medical doctor and then verified by an experienced gastroenterologist.
140 PAPERS • 3 BENCHMARKS
Risk-Aware Planning is a dataset that contains overhead images and their semantic segmentation, captured by a drone in the CityEnviron environment of the AirSim simulator.
…This work proposes an unsupervised approach to constructing a speech-to-speech corpus, aligned at the level of short segments, to produce a parallel speech corpus in the source and target languages. Our methodology exploits video frames, speech recognition, machine translation, and noisy-frame removal algorithms to match segments in both languages.
…for each object: 600 12-megapixel images, sampling the viewing hemisphere; 600 registered RGB-D point clouds from a Carmine 1.09 sensor; pose information for each of the above images and point clouds; segmentation masks for each of the above images (and segmented point clouds); merged point clouds consisting of data from all 600 viewpoints; reconstructed meshes from the merged point clouds. Paper: ICRA 2014 "A Large-Scale …
MapAI: Precision in Building Segmentation Dataset. The dataset comprises 7500 training images and 1500 validation images from Denmark. The test dataset is split into two tasks, where the first task (1368 images) is to segment the buildings using only aerial images.
0 PAPER • NO BENCHMARKS YET
…The data set consists of approximately 380,000 video segments about 19s long, automatically selected to feature objects in natural settings without editing or post-processing, with a recording quality … All video segments were human-annotated with high-precision classification labels and bounding boxes at 1 frame per second.
7 PAPERS • 1 BENCHMARK