…Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations. This dataset has been widely used as a benchmark for object detection, semantic segmentation, and classification tasks.
138 PAPERS • 22 BENCHMARKS
…AVA Speech densely annotates audio-based speech activity in AVA v1.0 videos, and explicitly labels 3 background noise conditions, resulting in ~46K labeled segments spanning 45 hours of data.
94 PAPERS • 7 BENCHMARKS