SA-1B consists of 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B high-quality segmentation masks.
157 PAPERS • 1 BENCHMARK
For the details of the work, readers are referred to the paper "Feature Pyramid and Hierarchical Boosting Network for Pavement Crack Detection" (FPHB), T-ITS 2019. The paper is available at https://www.researchgate.net/publication/330244656_Feature_Pyramid_and_Hierarchical_Boosting_Network_for_Pavement_Crack_Detection or https://arxiv.org/abs/1901.06340.
32 PAPERS • NO BENCHMARKS YET
Manual crown delineation of individual trees in two countries: Denmark and Finland.
5 PAPERS • NO BENCHMARKS YET
The dataset offers tag and mask annotations for image-text pairs from the CC3M validation set. Tag annotations denote words that aptly describe the relationship between the image and the corresponding text. These annotations provide valuable insights into the semantic connection between each pair's visual and textual elements.
5 PAPERS • 2 BENCHMARKS
MMFlood is a remote sensing dataset derived from Sentinel-1 (VV-VH), MapZen (DEM), and OpenStreetMap (hydrography). It provides a complete and well-rounded set of data specifically designed for flood events, focusing on three main features: worldwide distribution, manually validated annotations, and multiple modalities.
5 PAPERS • 1 BENCHMARK
CrackForest Dataset is an annotated road crack image database that reflects general urban road surface conditions.
4 PAPERS • 1 BENCHMARK
We present the CrackVision12k dataset, a collection of 12,000 crack images derived from 13 publicly available crack datasets. Individually, these datasets were too small to train a deep learning model effectively, and their masks were annotated to different standards, so unifying the annotations was necessary. To achieve this, we applied various image processing techniques to each dataset to produce masks that follow a consistent standard.
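Such a unification step can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the function name and the brightness-based inversion heuristic are assumptions made for the example.

```python
import numpy as np

def unify_mask(mask: np.ndarray, threshold: int = 127) -> np.ndarray:
    """Binarize a grayscale crack mask to one standard:
    crack pixels -> 255, background -> 0."""
    # Heuristic (assumption): if most of the image is bright, the source
    # dataset likely marked cracks dark on a light background, so invert
    # before thresholding to keep the crack as the bright class.
    if mask.mean() > 127:
        mask = 255 - mask
    return np.where(mask > threshold, 255, 0).astype(np.uint8)
```

Running every source dataset's masks through one such function yields annotations with a consistent foreground/background convention.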
The George B. Moody PhysioNet Challenges are annual competitions that invite participants to develop automated approaches for addressing important physiological and clinical problems. The 2024 Challenge invites teams to develop algorithms for digitizing and classifying electrocardiograms (ECGs) captured from images or paper printouts. Despite the recent advances in digital ECG devices, physical or paper ECGs remain common, especially in the Global South. These physical ECGs document the history and diversity of cardiovascular diseases (CVDs), and algorithms that can digitize and classify these images have the potential to improve our understanding and treatment of CVDs, especially for underrepresented and underserved populations.
To our knowledge, the dataset used in this project is the largest crack segmentation dataset to date. It contains around 11,200 images merged from 12 available crack segmentation datasets.
4 PAPERS • 2 BENCHMARKS
Extension of the PASTIS benchmark with radar and optical image time series.
The human brain receives nutrients and oxygen through an intricate network of blood vessels. Pathology affecting small vessels, at the mesoscopic scale, represents a critical vulnerability within the cerebral blood supply and can lead to severe conditions, such as Cerebral Small Vessel Diseases. The advent of 7 Tesla MRI systems has enabled the acquisition of higher spatial resolution images, making it possible to visualise such vessels in the brain. However, the lack of publicly available annotated datasets has impeded the development of robust, machine learning-driven segmentation algorithms. To address this, the SMILE-UHURA challenge was organised. This challenge, held in conjunction with ISBI 2023 in Cartagena de Indias, Colombia, aimed to provide a platform for researchers working on related topics. The SMILE-UHURA challenge addresses the gap in publicly available annotated datasets by providing an annotated dataset of Time-of-Flight angiography acquired with 7T MRI.
3 PAPERS • NO BENCHMARKS YET
This is the first general Underwater Image Instance Segmentation (UIIS) dataset, containing 4,628 images across 7 categories with pixel-level annotations for the underwater instance segmentation task.
3 PAPERS • 1 BENCHMARK
Intracranial hemorrhage (ICH) is a pathological condition characterized by bleeding inside the skull or brain, which can be attributed to various factors. Identifying, localizing and quantifying ICH has important clinical implications, in a bleed-dependent manner. While deep learning techniques are widely used in medical image segmentation and have been applied to the ICH segmentation task, existing public ICH datasets do not support the multi-class segmentation problem. To address this, we develop the Brain Hemorrhage Segmentation Dataset (BHSD), which provides a 3D multi-class ICH dataset containing 192 volumes with pixel-level annotations and 2200 volumes with slice-level annotations across five categories of ICH. To demonstrate the utility of the dataset, we formulate a series of supervised and semi-supervised ICH segmentation tasks. We provide experimental results with state-of-the-art models as reference benchmarks for further model developments and evaluations on this dataset.
2 PAPERS • NO BENCHMARKS YET
The ULS23 test set contains 725 lesions from 284 patients of the Radboudumc and JBZ hospitals in the Netherlands. It is intended to be used to measure the performance of 3D universal lesion segmentation models for Computed Tomography (CT). To prepare the data, radiological reports from both participating institutions were searched using NLP tools to identify patients with measurable target lesions, indicating that these lesions were clinically relevant. A random sample of patients was selected, 56.3% of whom were male, with scans from diverse scanner manufacturers. The lesions were annotated in 3D by expert radiologists with over 10 years of experience in reading oncological scans. ULS23 is an open benchmark, and we invite ongoing submissions to advance the development of future ULS models.
2 PAPERS • 1 BENCHMARK
The Underwater Trash Detection Dataset is a custom-annotated dataset designed to address the challenges of underwater trash detection caused by varying environmental features. Publicly available datasets alone are insufficient for training deep learning models due to domain-specific variations in underwater conditions. This dataset offers a cumulative, self-annotated collection of underwater images for detecting and classifying trash, providing a strong foundation for deep learning research and benchmark testing.
This dataset consists of annotated images and videos of smoke resulting from prescribed burning events in Finnish boreal forests. The dataset was created to train and validate learning-based methods for wildfire detection and smoke segmentation and its effectiveness in doing so was shown in the linked studies.
1 PAPER • NO BENCHMARKS YET
RGB-D instance segmentation box dataset. The Box-IS dataset was created to support research on human-robot collaboration with a focus on robotic manipulation tasks. It was captured using the Intel® RealSense™ Depth Camera D455, a high-performance sensor designed for depth imaging. To ensure precise depth measurements, we bypassed the default depth data processing of the sensor and performed accurate stereo matching directly from the captured left and right IR images. Employing the UniMatch technique, we derived a high-quality depth map from these stereo IR images, which was then aligned with the corresponding RGB image for a comprehensive output. The dataset was intentionally designed to encompass a broad range of scene complexities, from simple box arrangements to highly irregular configurations. This diversity ensures that it can effectively benchmark algorithms across varying levels of difficulty.
1 PAPER • 2 BENCHMARKS
The researchers of Qatar University have compiled the COVID-QU-Ex dataset, which consists of 33,920 chest X-ray (CXR) images, including:

* 11,956 COVID-19
* 11,263 Non-COVID infections (Viral or Bacterial Pneumonia)
* 10,701 Normal

Ground-truth lung segmentation masks are provided for the entire dataset, making it the largest lung mask dataset created to date.
This dataset contains images from Sentinel-2 satellites taken before and after a wildfire. The ground truth masks are provided by the California Department of Forestry and Fire Protection and are mapped onto the images. The dataset is designed for binary semantic segmentation of burned vs. unburned areas.
This dataset derives from COIL-100. There are more than 1.1M images of 100 objects. Each object was rotated on a turntable through 360 degrees to vary its pose with respect to a fixed color camera. Images of the objects were taken at pose intervals of 5 degrees, corresponding to 72 poses per object. Planar rotation (9 angles) and 18 scaling factors were then applied. The objects have a wide variety of complex geometric and reflectance characteristics.
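The stated size follows directly from the listed augmentation factors; a quick check using the counts from the description above:

```python
objects = 100        # COIL-100 object classes
poses = 360 // 5     # one view every 5 degrees -> 72 poses per object
rotations = 9        # planar rotation angles
scales = 18          # scaling factors

total = objects * poses * rotations * scales
print(total)  # 1166400, i.e. roughly 1.1M images
```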
A dataset of abdominal CT studies in NIfTI format from the open-source medical data repository Medical Decathlon was utilized. To expedite the partitioning process, the MONAILabel plugin of the MONAI framework within the 3D Slicer program was employed. A radiologist with 15 years of experience conducted a validation process, wherein the boundaries of the colon markup were verified on each slice. The existing colorectal cancer markings in the dataset remained unaltered. Validation by the radiologist reduced the validated dataset to 122 studies, which were categorized into three subsets based on data quality: the "good" subset comprises 100 studies; the "cropped" subset contains 17 studies in which the entire colon is not visible in the image; and the "bad" subset comprises five studies. Two of these were of such poor quality that the entire colon could not be identified, and two further studies involved colon stomas following surgery.
1 PAPER • 1 BENCHMARK
We introduce a new style- and category-agnostic floor plan image parsing benchmark developed in collaboration with professional architectural designers. This benchmark includes 25 categories of space and adjacency labels (19 space elements and 6 adjacency elements), offering a more diverse and comprehensive representation of common design elements across various graphical styles and design categories. It sets a new standard for the level of diversity and complexity of floor plan image parsing tasks oriented towards real-world applications, far exceeding the scope of existing datasets. This benchmark is available at https://doi.org/10.7910/DVN/MDIRHE.
This data set comprises 22 fundus images with their corresponding manual annotations for the blood vessels, separated into arteries and veins. It also includes glaucomatous/healthy labels, differentiating between normal tension glaucoma (NTG) and primary open angle glaucoma (POAG).
The LLM-Seg40K dataset contains 14K images in total. The dataset is divided into training, validation, and test sets, containing 11K, 1K, and 2K images respectively. For the training split, each image has 3.95 questions on average, and the average question length is 15.2 words. The training set contains 1458 different categories in total.
During the COVID-19 era, wearing face masks posed new challenges to face-related tasks, including facial recognition, face inpainting, expression recognition, and object removal. Mask region segmentation is a preliminary stage in tackling the occlusion issue for these tasks. Existing masked face datasets do not provide binary segmentation maps, because segmenting mask regions manually is a time-consuming operation. As a result, existing unmasking methods synthesize training data by overlaying masks on existing face datasets. However, since these techniques rely on artificially generated masks, their results tend to look unnatural. To address this issue, the Masked Face Segmentation Dataset (MFSD) provides the first public training dataset for the mask segmentation task.
Pulmonary hypertension (PH) is a syndrome complex that accompanies a number of diseases of different etiologies, associated with basic mechanisms of structural and functional changes of the pulmonary circulation vessels and with increased pressure in the pulmonary artery. The structural changes in the pulmonary circulation vessels are the main limiting factor determining the prognosis of patients with PH. Thickening and irreversible deposition of collagen in the walls of the pulmonary artery branches lead to rapid disease progression and decreased therapy effectiveness. In this regard, histological examination of the pulmonary circulation vessels is critical both in preclinical studies and in clinical practice. However, measurements of quantitative parameters such as the average vessel outer diameter, the vessel wall area, and the hypertrophy index require significant time investment and specialist training to analyze micrographs. A dataset of pulmonary circulation vessel micrographs is therefore provided.
The dataset is recorded with an on-vehicle ZED stereo camera in both urban and rural environments.
A RGB-D dataset converted from NYUDv2 into COCO-style instance segmentation format. To construct NYUDv2-IS, specifically tailored for instance segmentation, we generated instance masks that delineate individual objects in each image. These masks were labeled using the object class annotations provided in the original NYUDv2 dataset, which is distributed in MATLAB format. The process involved several key steps: (1) extracting binary instance masks, (2) converting these masks into polygon representations, and (3) generating COCO-style annotations. Each annotation includes essential attributes such as category ID, segmentation masks, bounding boxes, object areas, and image metadata. During this conversion, we focused on 9 categories out of the original 13 classes, excluding non-instance categories such as walls and floors. To ensure dataset quality, images without any object annotations were systematically removed.
Late third instar wing imaginal discs were cultured in Shields and Sang M3 media (Sigma) supplemented with 2% FBS (Sigma), 1% pen/strep (Gibco), 3 ng/ml ecdysone (Sigma) and 2 ng/ml insulin (Sigma). Wing discs were cultured in 35 mm fluorodishes (WPI) under 12 mm filters (Millicell), as described in https://doi.org/10.1038/s41567-019-0618-1
An RGB-D dataset converted from SUN-RGBD into COCO-style instance segmentation format. To transform SUN-RGBD into an instance segmentation benchmark (i.e., SUN-RGBDIS), we employed a pipeline similar to that of NYUDv2-IS. We selected 17 categories from the original 37 classes, carefully omitting non-instance categories like ceilings and walls. Images lacking any identifiable object instances were filtered out to maintain dataset relevance for instance segmentation tasks. We systematically converted the segmentation annotations into COCO format, generating precise bounding boxes, instance masks, and object attributes.
Pre-training is a strong strategy for enhancing visual models so that they can be trained efficiently with a limited number of labeled images. In semantic segmentation, creating annotation masks requires an intensive amount of labor and time, and therefore a large-scale pre-training dataset with semantic labels is quite difficult to construct. Moreover, what matters in pre-training for semantic segmentation has not been fully investigated. In this paper, we propose the Segmentation Radial Contour DataBase (SegRCDB), which for the first time applies formula-driven supervised learning to semantic segmentation. SegRCDB enables pre-training for semantic segmentation without real images or any manual semantic labels. SegRCDB is based on insights about what is important in pre-training for semantic segmentation and allows efficient pre-training. Pre-training with SegRCDB achieved higher mIoU than pre-training with COCO-Stuff for fine-tuning on ADE20K and Cityscapes with the same number of training images.
We construct the first large-scale dataset, USIS10K, for the underwater salient instance segmentation task, which contains 10,632 images and pixel-level annotations of 7 categories. As far as we know, this is the largest salient instance segmentation dataset, and includes Class-Agnostic and Multi-Class labels simultaneously.
A fully synthetic dataset of drones generated using structured domain randomization. It contains multiple datasets generated using different styles:
- Drones only
- Drones and Birds
- Generic Distractors
- Realistic Distractors
- Random Backgrounds