Cityscapes is a large-scale database which focuses on semantic understanding of urban street scenes. It provides semantic, instance-wise, and dense pixel annotations for 30 classes grouped into 8 categories (flat surfaces, humans, vehicles, constructions, objects, nature, sky, and void). The dataset consists of around 5000 fine annotated images and 20000 coarse annotated ones. Data was captured in 50 cities during several months, daytimes, and good weather conditions. It was originally recorded as video so the frames were manually selected to have the following features: large number of dynamic objects, varying scene layout, and varying background.
2,534 PAPERS • 37 BENCHMARKS
The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect. It features:
601 PAPERS • 18 BENCHMARKS
CamVid (Cambridge-driving Labeled Video Database) is a road/driving scene understanding database which was originally captured as five video sequences with a 960×720 resolution camera mounted on the dashboard of a car. Those sequences were sampled (four of them at 1 fps and one at 15 fps) adding up to 701 frames. Those stills were manually annotated with 32 classes: void, building, wall, tree, vegetation, fence, sidewalk, parking block, column/pole, traffic cone, bridge, sign, miscellaneous text, traffic light, sky, tunnel, archway, road, road shoulder, lane markings (driving), lane markings (non-driving), animal, pedestrian, child, cart luggage, bicyclist, motorcycle, car, SUV/pickup/truck, truck/bus, train, and other moving object
194 PAPERS • 3 BENCHMARKS
The Common Objects in COntext-stuff (COCO-stuff) dataset is a dataset for scene understanding tasks like semantic segmentation, object detection and image captioning. It is constructed by annotating the original COCO dataset, which originally annotated things while neglecting stuff annotations. There are 164k images in COCO-stuff dataset that span over 172 categories including 80 things, 91 stuff, and 1 unlabeled class.
171 PAPERS • 14 BENCHMARKS
Kvasir-SEG is an open-access dataset of gastrointestinal polyp images and corresponding segmentation masks, manually annotated by a medical doctor and then verified by an experienced gastroenterologist.
76 PAPERS • 3 BENCHMARKS
The KVASIR Dataset was released as part of the medical multimedia challenge presented by MediaEval. It is based on images obtained from the GI tract via an endoscopy procedure. The dataset is composed of images that are annotated and verified by medical doctors, and captures 8 different classes. The classes are based on three anatomical landmarks (z-line, pylorus, cecum), three pathological findings (esophagitis, polyps, ulcerative colitis) and two other classes (dyed and lifted polyps, dyed resection margins) related to the polyp removal process. Overall, the dataset contains 8,000 endoscopic images, with 1,000 image examples per class.
47 PAPERS • 3 BENCHMARKS
Consists of annotated frames containing GI procedure tools such as snares, balloons and biopsy forceps, etc. Beside of the images, the dataset includes ground truth masks and bounding boxes and has been verified by two expert GI endoscopists.
8 PAPERS • 2 BENCHMARKS
The dataset contains a Video capsule endoscopy dataset for polyp segmentation.
3 PAPERS • 1 BENCHMARK
The “Medico automatic polyp segmentation challenge” aims to develop computer-aided diagnosis systems for automatic polyp segmentation to detect all types of polyps (for example, irregular polyp, smaller or flat polyps) with high efficiency and accuracy. The main goal of the challenge is to benchmark semantic segmentation algorithms on a publicly available dataset, emphasizing robustness, speed, and generalization.
3 PAPERS • 1 BENCHMARK
A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying images from the GI tract into 23 distinct classes. The second task focuses on efficiant classification measured by the amount of time spent processing each image. The last task relates to automatcially segmenting polyps.
1 PAPER • 1 BENCHMARK
Large-scale and open-access LiDAR dataset intended for the evaluation of real-time semantic segmentation algorithms. In contrast to other large-scale datasets, HelixNet includes fine-grained data about the sensor's rotation and position, as well as the points' release time.
1 PAPER • 1 BENCHMARK