For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real images. Hypersim is a photorealistic synthetic dataset for holistic indoor scene understanding. It contains 77,400 images of 461 indoor scenes with detailed per-pixel labels and corresponding ground truth geometry.
61 PAPERS • 1 BENCHMARK
The dataset of Thermal Bridges on Building Rooftops (TBBR dataset) consists of annotated combined RGB and thermal drone images with a height map. All images were converted to a uniform format of 3000×4000 pixels, aligned, and cropped to 2400×3400 to remove empty borders.
2 PAPERS • 2 BENCHMARKS
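The crop arithmetic in the TBBR description (3000×4000 down to 2400×3400) can be illustrated with a minimal sketch. It assumes a centered crop and that the first number in each pair is the row count; the description states neither, and the helper name `center_crop_box` is hypothetical, not part of any TBBR tooling.

```python
def center_crop_box(height, width, out_height, out_width):
    """Return (top, left, bottom, right) for a centered crop.

    Assumption: the crop is centered. The dataset description only says
    that empty borders were removed, not where the crop window sits.
    """
    if out_height > height or out_width > width:
        raise ValueError("crop size must not exceed the image size")
    top = (height - out_height) // 2
    left = (width - out_width) // 2
    return top, left, top + out_height, left + out_width


# Assumed layout: 3000 rows by 4000 columns, cropped to 2400 by 3400.
box = center_crop_box(3000, 4000, 2400, 3400)
print(box)
```

Under these assumptions the crop removes a 300-pixel border on every side. With an array-based image library, the returned box would map onto a slice such as `image[top:bottom, left:right]`.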
InfraParis is a novel and versatile dataset supporting multiple tasks across three modalities: RGB, depth, and infrared. It covers a variety of urban styles across the greater Paris area, from the city to the suburbs, providing rich semantic information. InfraParis contains 7301 images with bounding boxes and full semantic annotations covering 19 classes. We assess various state-of-the-art baseline techniques, encompassing models for semantic segmentation, object detection, and depth estimation.
1 PAPER • NO BENCHMARKS YET