Dataset for one-shot segmentation.
3 PAPERS • 1 BENCHMARK
DanbooRegion is a dataset consisting of 5,377 pairs of in-the-wild illustrations, downloaded from Danbooru2018, and region segmentation map annotations. Illustrations are provided as 1024px 8-bit RGB images, and region segmentation maps as int-32 index images.
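The int-32 index convention can be sketched as follows: each pixel of the region map carries an integer region id, and pixels sharing an id belong to one region. A minimal sketch (the array shapes and ids below are synthetic, not taken from the dataset):

```python
# Sketch: working with a DanbooRegion-style int32 region index map.
# Every pixel holds a region id; equal ids form one region.
import numpy as np

def region_masks(regions: np.ndarray) -> dict:
    """Split an (H, W) int32 region index map into boolean masks, one per region id."""
    return {int(i): regions == i for i in np.unique(regions)}

# Tiny synthetic 2x3 region map with three regions (ids 0, 1, 2).
regions = np.array([[0, 0, 1],
                    [2, 2, 1]], dtype=np.int32)
masks = region_masks(regions)
assert masks[0].sum() == 2 and masks[1].sum() == 2 and masks[2].sum() == 2
```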
1 PAPER • NO BENCHMARKS YET
…The Task 1 challenge dataset for lesion segmentation contains 2,000 images for training with ground truth segmentations (2000 binary mask images).
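With binary-mask ground truth like this, lesion segmentation entries are usually scored by region overlap. A minimal Dice-coefficient sketch (using Dice as the metric is an assumption here, not stated in the blurb):

```python
# Sketch: Dice overlap between a predicted and a ground-truth binary lesion mask.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for boolean masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float(2.0 * inter / (pred.sum() + gt.sum() + eps))

gt   = np.array([[1, 1, 0], [0, 0, 0]])
pred = np.array([[1, 0, 0], [0, 0, 0]])
# 2*1 / (1 + 2) ≈ 0.667
assert abs(dice(pred, gt) - 2 / 3) < 1e-6
```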
14 PAPERS • NO BENCHMARKS YET
PASCAL VOC 2011 is an image segmentation dataset. The training set contains 2,223 images with 5,034 segmented objects; the test set contains 1,111 images with 2,028 objects. In total there are over 5,000 precisely segmented objects for training.
19 PAPERS • 2 BENCHMARKS
Multimodal Brain Tumor Segmentation Challenge 2019
Multimodal Brain Tumor Segmentation Challenge 2018
2 PAPERS • NO BENCHMARKS YET
…It not only supports video panoptic segmentation (VPS) task, but also provides super-set annotations for video semantic segmentation (VSS) and video instance segmentation (VIS) tasks.
24 PAPERS • 1 BENCHMARK
Fetoscopic Placental Vessel Segmentation and Registration (FetReg) is a large-scale multi-centre dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms.
…Each reconstruction has clean dense geometry, high-resolution and high-dynamic-range textures, glass and mirror surface information, planar segmentation, as well as semantic class and instance segmentation.
287 PAPERS • 3 BENCHMARKS
A large-scale VIdeo Panoptic Segmentation dataset
21 PAPERS • 1 BENCHMARK
…A segment is defined as a continuous span of words in the source, chosen as a part of the summary.
2. A word should not be fragmented, e.g., if the word "breaking" appears in the source, the entire word should be part of the segment, not a fragment like "break".
3. Each segment should be relevant to the plot, try to advance the story, and have some continuity with the preceding and following segments.
4. Each segment extracted from a dialogue should be enclosed in quotes.
5. Each segment extracted from parentheses should be enclosed in parentheses.
6. Segments should be arranged in the same order as they appear in the story.
7. The summary should be minimal: if multiple segments mean the same thing, pick the shortest.
…Each video is labelled with 3.91 step segments on average, each lasting 14.91 seconds. In total, the dataset contains 476 hours of video, with 46,354 annotated segments.
79 PAPERS • 2 BENCHMARKS
The ScanNet200 benchmark studies 200-class 3D semantic segmentation, an order of magnitude more class categories than previous 3D scene understanding benchmarks. The source of scene data is identical to ScanNet, but it parses a larger vocabulary for semantic and instance segmentation.
22 PAPERS • 3 BENCHMARKS
…The images are annotated by segmentation masks of the object(s) of interest. The original purpose of the data collection is for gesture-aware object-agnostic segmentation tasks.
The Vocal Folds dataset is a dataset for automatic segmentation of laryngeal endoscopic images. It consists of 8 sequences from 2 patients, containing 536 hand-segmented in vivo colour images of the larynx, captured during two different resection interventions at a resolution of 512x512 pixels.
3 PAPERS • NO BENCHMARKS YET
BRATS 2014 is a brain tumor segmentation dataset.
5 PAPERS • 1 BENCHMARK
A dataset for interactive segmentation with simulated initial masks.
4 PAPERS • 1 BENCHMARK
We introduce ACDC, the Adverse Conditions Dataset with Correspondences, for training and testing semantic segmentation methods on adverse visual conditions. ACDC supports two tasks: (1) standard semantic segmentation and (2) uncertainty-aware semantic segmentation.
23 PAPERS • 4 BENCHMARKS
…The ACDC dataset contains cardiac MRI images paired with hand-made segmentation masks. It is possible to use the segmentation masks provided in the ACDC dataset to evaluate the performance of methods trained using only scribble supervision. References: [1] Bernard, Olivier, et al. "Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved?" IEEE Transactions on Medical Imaging 37.11 (2018): 2514-2525.
9 PAPERS • 1 BENCHMARK
The MSP-Podcast corpus contains speech segments from podcast recordings which are perceptually annotated using crowdsourcing. The collection of this corpus is an ongoing process. Most of the segments in a regular podcast are neutral, so we use machine learning techniques trained on available data to retrieve candidate segments, which are then emotionally annotated with crowdsourcing. This approach allows us to spend our resources on speech segments that are likely to convey emotions.
3 PAPERS • 4 BENCHMARKS
Accurate lesion segmentation is critical in stroke rehabilitation research for the quantification of lesion burden and accurate image processing. Current automated lesion segmentation methods for T1-weighted (T1w) MRIs, commonly used in rehabilitation research, lack accuracy and reliability. Manual segmentation remains the gold standard, but it is time-consuming, subjective, and requires significant neuroanatomical expertise. Here we present ATLAS v2.0 (N=1271), a larger dataset of T1w stroke MRIs and manually segmented lesion masks that includes training (public, n=655), test (masks hidden, n=300), and generalizability (completely hidden) datasets. Algorithm development using this larger sample should lead to more robust solutions, and the hidden test and generalizability datasets allow for unbiased performance evaluation via segmentation challenges.
6 PAPERS • 1 BENCHMARK
We design an all-day semantic segmentation benchmark, all-day CityScapes. It is the first semantic segmentation benchmark that contains samples from all-day scenarios, i.e., from dawn to night.
The Student Essay dataset is widely used in research on argument segmentation.
A composite dataset that unifies semantic segmentation datasets from different domains.
18 PAPERS • NO BENCHMARKS YET
TAU Urban Acoustic Scenes 2019 Mobile development dataset consists of 10-second audio segments from 10 acoustic scenes: Airport, Indoor shopping mall, Metro station, Pedestrian street, Public square, Street, … Each acoustic scene has 1,440 segments (240 minutes of audio) recorded with device A (the main device) and 108 segments (18 minutes) of parallel audio recorded with each of devices B and C.
DELIVER is an arbitrary-modal segmentation benchmark, covering Depth, LiDAR, multiple Views, Events, and RGB. It is designed for the tasks of arbitrary-modal semantic segmentation.
7 PAPERS • 2 BENCHMARKS
ScribbleKITTI is a scribble-annotated dataset for LiDAR semantic segmentation.
16 PAPERS • 2 BENCHMARKS
…The segmentation evaluation is based on three tasks: whole tumor (WT), tumor core (TC), and enhancing tumor (ET) segmentation.
73 PAPERS • 1 BENCHMARK
MyoPS is a dataset for myocardial pathology segmentation combining three-sequence cardiac magnetic resonance (CMR) images, first proposed in the MyoPS challenge held in conjunction with MICCAI 2020. The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation.
5 PAPERS • NO BENCHMARKS YET
MinneApple is a benchmark dataset for apple detection and segmentation. The fruits are labelled using polygonal masks for each object instance to aid in precise object detection, localization, and segmentation.
15 PAPERS • NO BENCHMARKS YET
…The archive also includes pictures of the segmentation results with the masks and collages. Toloka was used for photo capturing, segmentation, and recognizing the readings.
0 PAPERS • NO BENCHMARKS YET
The largest real-world night-time semantic segmentation dataset with pixel-level labels.
9 PAPERS • NO BENCHMARKS YET
…data includes synchronized and aligned samples of the following: angle of linear polarization (AoLP) images, degree of linear polarization (DoLP) images, RGB images, lidar scans, ground truth free space segmentation (road segmentation), GNSS / IMU readings (vehicle location, vehicle orientation, vehicle speed, vehicle acceleration, etc.) and calibration matrices. Additionally, the dataset includes free space segmentation of 8,141 images.
…It is a concealed defect segmentation dataset assembled from five well-known defect segmentation databases. It contains five sub-databases: MVTecAD, NEU, CrackForest, KolektorSDD, and MagneticTile.
4 PAPERS • NO BENCHMARKS YET
The BraTS 2015 dataset is a dataset for brain tumor image segmentation. It consists of 220 high-grade glioma (HGG) and 54 low-grade glioma (LGG) MRIs. Segmented “ground truth” is provided for four intra-tumoral classes, viz. edema, enhancing tumor, non-enhancing tumor, and necrosis.
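Given a label map covering the four intra-tumoral classes above, per-class masks can be peeled off as in this sketch (the numeric label ids here are assumptions for illustration; the official BraTS label convention should be checked before use):

```python
# Sketch: splitting a BraTS-style multi-class label volume into per-class masks.
# Label ids below are hypothetical, chosen only to illustrate the indexing.
import numpy as np

CLASSES = {1: "necrosis", 2: "edema", 3: "non-enhancing tumor", 4: "enhancing tumor"}

def split_classes(labels: np.ndarray) -> dict:
    """Return one boolean mask per intra-tumoral class (background id 0 skipped)."""
    return {name: labels == idx for idx, name in CLASSES.items()}

labels = np.array([0, 1, 2, 2, 4], dtype=np.uint8)  # toy 1-D "volume"
masks = split_classes(labels)
assert masks["edema"].sum() == 2 and masks["necrosis"].sum() == 1
```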
66 PAPERS • 1 BENCHMARK
This dataset contains pre- and post-destruction images, as well as segmentation labels for test images.
This is a dataset for segmentation and classification of epistemic activities in diagnostic reasoning texts.
The York Urban Line Segment Database is a compilation of 102 images (45 indoor, 57 outdoor) of urban environments, consisting mostly of scenes from the campus of York University and downtown Toronto, Canada. Each image in the database has been hand-labelled to identify the set of line segments satisfying the “Manhattan assumption” (Coughlan & Yuille 2003), i.e., the set of line segments aligned with the three principal orthogonal directions of the scene. The database provides the original images, camera calibration parameters, ground-truth line segments, and the estimated Manhattan frame relative to the camera for each image.
15 PAPERS • 2 BENCHMARKS
ImageTBAD, a 3D Computed Tomography (CT) image dataset for segmentation of Type-B Aortic Dissection (TBAD), is published. The segmentation labeling is performed by a team of two cardiovascular radiologists who have extensive experience with TBAD: the labeling of each patient is done by one radiologist and checked by the other.
…Mechanical Turk (AMT) is used to collect annotations on HowTo100M videos. 30k 60-second clips are randomly sampled from 9,421 videos, and each clip is presented to the turkers, who are asked to select a video segment. After this segment selection step, another group of workers is asked to write descriptions for each displayed segment. These final video segments are 10-20 seconds long on average, and the length of queries ranges from 8 to 20 words.
Video object segmentation has been studied extensively in the past decade due to its importance in understanding video spatial-temporal structures as well as its value in industrial applications. Previously, we presented the first large-scale video object segmentation dataset named YouTubeVOS and hosted the Large-scale Video Object Segmentation Challenge in conjunction with ECCV 2018 and ICCV 2019. This year, we are thrilled to invite you to the 4th Large-scale Video Object Segmentation Challenge in conjunction with CVPR 2022.
The Waymo Open Dataset currently contains 1,950 segments. The authors plan to grow this dataset in the future. Currently the dataset includes:
- 1,950 segments of 20s each, collected at 10Hz (390,000 frames) in diverse geographies and conditions
- Sensor data: 1 mid-range lidar, 4 short-range lidars, 5 cameras (front and sides), lidar-to-camera projections, sensor calibrations and vehicle poses
- Labeled data: labels for 4 object classes (Vehicles, Pedestrians, Cyclists, Signs); high-quality labels for lidar data in 1,200 segments (12.6M 3D bounding box labels with tracking IDs); high-quality labels for camera data in 1,000 segments (11.8M 2D bounding box labels with tracking IDs)
380 PAPERS • 12 BENCHMARKS
CheXlocalize is a radiologist-annotated segmentation dataset on chest X-rays. The dataset consists of two types of radiologist annotations for the localization of 10 pathologies: pixel-level segmentations and most-representative points. The dataset also consists of two separate sets of radiologist annotations: (1) ground-truth pixel-level segmentations on the validation and test sets, drawn by two board-certified radiologists, and (2) benchmark pixel-level segmentations and most-representative points on the test set, drawn by a separate group of three board-certified radiologists.
UPLight is an underwater RGB-Polarization multimodal semantic segmentation dataset with 12 typical underwater semantic classes.
Synthinel-1 is a collection of synthetic overhead imagery with full pixel-wise building segmentation labels.
PanNuke is a semi-automatically generated nuclei instance segmentation and classification dataset with exhaustive nuclei labels across 19 different tissue types. In total the dataset contains 205,343 labeled nuclei, each with an instance segmentation mask.
44 PAPERS • 3 BENCHMARKS
Research on semantic segmentation of traffic scenes using color and polarization information (including training and testing sets).
…text segmentation and text segment classification) tasks and comprises 169 documents with gold-standard annotations for page segments. Partition (P2) contains 75 documents with a significantly richer