SIDD is an image denoising dataset containing 30,000 noisy images captured in 10 scenes under different lighting conditions using five representative smartphone cameras. Ground-truth images are provided along with the noisy images.
240 PAPERS • 2 BENCHMARKS
The Color BSD68 dataset is part of the Berkeley Segmentation Dataset and Benchmark. It is used for measuring the performance of image denoising algorithms and contains 68 images.
138 PAPERS • 16 BENCHMARKS
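Denoising benchmarks such as BSD68 are typically scored by comparing each restored image against its ground truth with a fidelity metric like PSNR. As a minimal illustrative sketch (the function and toy images below are not part of any official benchmark harness):

```python
import numpy as np

def psnr(clean, restored, data_range=255.0):
    """Peak signal-to-noise ratio in dB between a ground-truth and a restored image."""
    mse = np.mean((clean.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy example: a flat gray image vs. a copy corrupted with Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((68, 68), 128, dtype=np.uint8)
noisy = np.clip(clean + rng.normal(0, 10, clean.shape), 0, 255).astype(np.uint8)
print(round(psnr(clean, noisy), 1))
```

A benchmark score is then the average PSNR (often alongside SSIM) over all 68 images.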
Raindrop is a set of image pairs in which each pair contains exactly the same background scene, one image degraded by raindrops and the other free of them. To obtain this, the images are captured through two pieces of identical glass: one sprayed with water, the other left clean. The dataset consists of 1,119 pairs of images with various background scenes and raindrops, captured with a Sony A6000 and a Canon EOS 60.
111 PAPERS • 1 BENCHMARK
A dataset consisting of 8,422 blurry/sharp image pairs with 65,784 densely annotated foreground human bounding boxes.
93 PAPERS • 4 BENCHMARKS
The PIRM dataset consists of 200 images, which are divided into two equal sets for validation and testing. These images cover diverse contents, including people, objects, environments, flora, natural scenery, etc. Images vary in size, and are typically ~300K pixels in resolution.
32 PAPERS • 1 BENCHMARK
TinyPerson is a benchmark for tiny object detection at long distances against massive backgrounds. The images in TinyPerson are collected from the Internet. First, high-resolution videos are collected from different websites. Second, images are sampled from the videos every 50 frames. Images with a certain repetition (homogeneity) are then deleted, and the remaining images are hand-annotated with 72,651 object bounding boxes.
25 PAPERS • NO BENCHMARKS YET
A dataset of various documents. Each of the 65 documents includes scanned ground-truth images, distorted photos of both hard and easy difficulty, and document-centered cropped images.
23 PAPERS • 3 BENCHMARKS
An image restoration dataset.
19 PAPERS • 1 BENCHMARK
A large-scale dataset of ~29.5K rain/rain-free image pairs covering a wide range of natural rain scenes.
11 PAPERS • NO BENCHMARKS YET
The first ultra-high-definition image demoireing dataset, consisting of 4,500 4K resolution training pairs and 500 standard 4K resolution validation pairs.
2 PAPERS • 1 BENCHMARK
Existing multi-modality image fusion datasets lack comprehensive coverage of adverse weather scenarios. To address this, we introduce AWMM-100k, a benchmark dataset constructed by selecting samples from RoadScene, MSRS, M3FD, and LLVIP, followed by controlled degradation processing to simulate adverse weather conditions. Combined with real-world data captured using a DJI M30T drone equipped with high-resolution visible and thermal cameras, AWMM-100k comprises 187,699 images covering rain, haze, and snow, each categorized into heavy, medium, and light intensities. The dataset supports research on multi-modality image fusion under challenging weather conditions and is also applicable to image restoration tasks such as dehazing, deraining, and desnowing. We thank the original datasets for their contributions. We believe this dataset significantly expands the scope of multimodal image processing and computer vision research, facilitating advancements in both image fusion and image restoration.
1 PAPER • NO BENCHMARKS YET
Synthetic training set: this set is constructed in the following two steps and is used for estimation/training purposes. i) 84,000 275×400-pixel ground-truth fingerprint images, free of noise or scratches but with random transformations (at most five pixels of translation and ±10 degrees of rotation), were generated using the software Anguli: Synthetic Fingerprint Generator. ii) 84,000 275×400-pixel degraded fingerprint images were generated by applying random artifacts (blur, brightness, contrast, elastic transformation, occlusion, scratch, resolution, rotation) and backgrounds to the ground-truth fingerprint images. In total, the set contains 168,000 fingerprint images (84,000 fingerprints, with two impressions per fingerprint: one ground truth and one degraded).
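A degradation step like the one in ii) can be sketched as follows. This is a hypothetical simplification using NumPy, not the dataset authors' actual pipeline; the artifact parameters and the random ridge-pattern stand-in are illustrative only:

```python
import numpy as np

def degrade(fp, rng):
    """Apply a few of the random artifacts described above (brightness/contrast
    jitter, occlusion, scratch) to a ground-truth fingerprint image in [0, 255].
    Hypothetical sketch only."""
    img = fp.astype(np.float64)
    # Random brightness/contrast jitter.
    img = img * rng.uniform(0.8, 1.2) + rng.uniform(-20, 20)
    # Random rectangular occlusion: a white patch hiding the ridges.
    h, w = img.shape
    y, x = rng.integers(0, h - 40), rng.integers(0, w - 40)
    img[y:y + 40, x:x + 40] = 255
    # A thin diagonal scratch across the image.
    for i in range(min(h, w)):
        img[i, i] = 255
    return np.clip(img, 0, 255).astype(np.uint8)

rng = np.random.default_rng(42)
clean = rng.integers(0, 256, size=(275, 400), dtype=np.uint8)  # stand-in for an Anguli image
noisy = degrade(clean, rng)
```

Running the generator once per ground-truth image yields the one-ground-truth/one-degraded pairing described above.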
HAC is a dataset for learning and benchmarking arbitrary Hybrid Adverse Conditions restoration. HAC contains 31 scenarios composed of arbitrary combinations of five common weather conditions, with a total of 316K adverse-weather/clean pairs.
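The 31 scenarios follow from the combinatorics: five weather conditions admit 2^5 − 1 = 31 non-empty combinations. A quick check (the five condition names below are placeholders, not necessarily HAC's actual conditions):

```python
from itertools import combinations

# Hypothetical names for the five weather conditions.
weathers = ["rain", "snow", "haze", "raindrop", "low-light"]

# All non-empty subsets: C(5,1) + C(5,2) + ... + C(5,5).
scenarios = [c for r in range(1, len(weathers) + 1)
             for c in combinations(weathers, r)]
print(len(scenarios))  # → 31
```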
The HRI Dataset comprises 3,200 image pairs in total. Each pair consists of a clean background image, a depth image, a rain-layer mask image, and a rainy image, at a resolution of $2048\times1024$. It covers three scenes: lane, citystreet, and japanesestreet. The lane scene contains 1,600 pairs (4 camera viewpoints, 100 moments per viewpoint, and 4 rain intensities per moment); the citystreet scene contains 600 pairs (6 viewpoints, 25 moments each, 4 intensities); and the japanesestreet scene contains 1,000 pairs (10 viewpoints, 25 moments each, 4 intensities).
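The per-scene counts can be sanity-checked from the stated layout, since each scene's pair count is viewpoints × moments × intensities:

```python
# (viewpoints, moments per viewpoint, rain intensities per moment)
scenes = {
    "lane":           (4, 100, 4),
    "citystreet":     (6, 25, 4),
    "japanesestreet": (10, 25, 4),
}
counts = {name: v * m * i for name, (v, m, i) in scenes.items()}
total = sum(counts.values())
print(total)  # → 3200, matching the stated 3,200 pairs
```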
The Sentinel-2 satellite carries 12 CMOS detectors for the VNIR bands, with adjacent detectors having overlapping fields of view that result in overlapping regions in Level-1B (L1B) images. This dataset includes 3,740 pairs of overlapping image crops extracted from two L1B products. Each crop has a height of around 400 pixels and a variable width that depends on the overlap between detectors for the RGBN bands, typically around 120-200 pixels. In addition to detector parallax, there is also cross-band parallax within each detector, resulting in shifts between bands. Pre-registration is performed for both cross-band and cross-detector parallax, with a precision of a few pixels (typically less than 10).
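The dataset's actual pre-registration method is not specified here; one common way to estimate the integer translation between two overlapping crops is FFT phase correlation, sketched below on a synthetic shift:

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer (dy, dx) translation of image a relative to b
    via FFT phase correlation. Illustrative sketch, not the dataset's method."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12          # normalize to keep phase only
    corr = np.fft.ifft2(R).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peak indices to signed shifts (circular convention).
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return int(dy), int(dx)

rng = np.random.default_rng(1)
ref = rng.random((400, 160))                       # crop-sized toy image
shifted = np.roll(ref, (3, -5), axis=(0, 1))       # simulate a small parallax shift
print(phase_correlation_shift(shifted, ref))       # → (3, -5)
```

Sub-pixel refinements of the correlation peak would be needed to reach precision finer than a pixel.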
The Raw Natural Image Noise Dataset (RawNIND) is a diverse collection of paired raw images designed to support the development of denoising models that generalize across sensors, image development workflows, and styles.
Smartphone cameras are ubiquitous in daily life, yet their performance can be severely impacted by dirty lenses, leading to degraded image quality. This issue is often overlooked in image restoration research, which assumes ideal or controlled lens conditions. To address this gap, we introduce SIDL (Smartphone Images with Dirty Lenses), a novel dataset designed for restoring images captured through contaminated smartphone lenses. SIDL contains diverse real-world images taken under various lighting conditions and environments. These images feature a wide range of lens contaminants, including water drops, fingerprints, and dust. Each contaminated image is paired with a clean reference image, enabling supervised learning approaches for restoration tasks. To evaluate the challenge posed by SIDL, various state-of-the-art restoration models were trained and compared on this dataset. They achieved some level of restoration but did not adequately address the diverse and realistic degradations caused by dirty lenses.
A dataset containing three difficult real-world scenarios: uncontrolled videos taken by UAVs and manned gliders, as well as controlled videos taken on the ground. Over 160,000 annotated frames for hundreds of ImageNet classes are available, which are used for baseline experiments that assess the impact of known and unknown image artifacts and other conditions on common deep learning-based object classification approaches.