HowMany-QA is an object counting dataset. It is drawn from the counting-specific union of VQA 2.0 (Goyal et al., 2017) and Visual Genome QA (Krishna et al., 2016).
5 PAPERS • 1 BENCHMARK
Effectively evaluating OmniCount across open-vocabulary, supervised, and few-shot counting tasks requires a dataset spanning a broad spectrum of visual categories, with multiple instances and multiple classes per image. Existing counting datasets focus primarily on single object categories such as humans or vehicles, and so fall short for multi-label object counting. Multi-class datasets such as MS COCO do exist, but the sparse appearance of objects per image limits their utility for counting. To address this gap, we created a new dataset of 30,230 images spanning 191 diverse categories, including kitchen utensils, office supplies, vehicles, and animals. With per-image instance counts ranging from 1 to 160 and an average count of 10, the dataset fills this void and establishes a benchmark for assessing counting models in varied scenarios.
1 PAPER • 2 BENCHMARKS
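The per-image count statistics quoted above (range 1–160, average 10) can be reproduced from any annotation file that records one instance count per image. The sketch below is a minimal illustration, assuming a hypothetical format where counts have already been extracted into a list; the function name and toy numbers are illustrative, not part of the dataset's actual tooling.

```python
# Hypothetical sketch: summarising per-image instance counts for a
# counting dataset. Assumes counts were pre-extracted into a flat list.
from statistics import mean


def count_stats(per_image_counts):
    """Return (min, max, mean) of instance counts across images."""
    return (
        min(per_image_counts),
        max(per_image_counts),
        mean(per_image_counts),
    )


# Toy example with made-up counts, not real dataset annotations:
counts = [1, 4, 10, 160, 7]
lo, hi, avg = count_stats(counts)
print(lo, hi, avg)  # minimum, maximum, and mean count per image
```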