5 dataset results for regression AND Images

Satlas is a remote sensing dataset and benchmark that is large in both breadth, featuring all of the aforementioned applications and more, as well as scale, comprising 290M labels under 137 categories and 7 label modalities.

5 PAPERS • NO BENCHMARKS YET

TTE-A&O (Travel Time Estimation: Abakan and Omsk)

The dataset includes two parts corresponding to the cities of Abakan (65524 nodes, 340012 edges) and Omsk (231688 nodes, 1149492 edges). Along with the road network graph, it includes trip records represented as sequences of visited nodes (making the dataset suitable both for path-blind and path-aware settings). There are two types of target values for a regression task: real travel time and real length of a trip.

2 PAPERS • 1 BENCHMARK

xView3-SAR

Unsustainable fishing practices worldwide pose a major threat to marine resources and ecosystems. Identifying vessels that do not show up in conventional monitoring systems---known as ``dark vessels''---is key to managing and securing the health of marine environments. With the rise of satellite-based synthetic aperture radar (SAR) imaging and modern machine learning (ML), it is now possible to automate detection of dark vessels day or night, under all-weather conditions. SAR images, however, require a domain-specific treatment and are not widely accessible to the ML community. Maritime objects (vessels and offshore infrastructure) are relatively small and sparse, challenging traditional computer vision approaches. We present the largest labeled dataset for training ML models to detect and characterize vessels and ocean structures in SAR imagery. xView3-SAR consists of nearly 1,000 analysis-ready SAR images from the Sentinel-1 mission that are, on average, 29,400-by-24,400 pixels each.

2 PAPERS • 1 BENCHMARK

Brightfield vs Fluorescent Staining Dataset

Differential fluorescent staining is an effective tool widely adopted for the visualization, segmentation and quantification of cells and cellular substructures as a part of standard microscopic imaging protocols. Incompatibility of staining agents with viable cells represents major and often inevitable limitations to its applicability in live experiments, requiring extraction of samples at different stages of experiment increasing laboratory costs. Accordingly, development of computerized image analysis methodology capable of segmentation and quantification of cells and cellular substructures from plain monochromatic images obtained by light microscopy without help of any physical markup techniques is of considerable interest. The enclosed set contains human colon adenocarcinoma Caco-2 cells microscopic images obtained under various imaging conditions with different viable vs non-viable cells fractions. Each field of view is provided in a three-fold representation, including phase-con

1 PAPER • NO BENCHMARKS YET

MineralImage5k (Benchmark for 5k raw mineral species recognition)

We present a comprehensive dataset comprising a vast collection of raw mineral samples for the purpose of mineral recognition. The dataset encompasses more than 5,000 distinct mineral species and incorporates subsets for zero-shot and few-shot learning. In addition to the samples themselves, some entries in the dataset are accompanied by supplementary natural language descriptions, size measurements, and segmentation masks. For detailed information on each sample, please refer to the minerals_full.csv file.

1 PAPER • NO BENCHMARKS YET

Datasets

5 dataset results for regression AND Images