Learning-to-Count by Learning-to-Rank: Weakly Supervised Object Counting & Localization Using Only Pairwise Image Rankings

29 Sep 2021 · Adriano C. D'Alessandro, Ali Mahdavi Amiri, Ghassan Hamarneh ·

Object counting and localization in dense scenes is a challenging class of image analysis problems that typically requires labour intensive annotations to learn to solve. We propose a form of weak supervision that only requires object-based pairwise image rankings. These annotations can be collected rapidly with a single click per image pair and supply a weak signal for object quantity. However, the problem of actually extracting object counts and locations from rankings is challenging. Thus, we introduce adversarial density map generation, a strategy for regularizing the features of a ranking network such that the features correspond to an object proposal map where each proposal must be a Gaussian blob that integrates to 1. This places a soft integer and soft localization constraint on the representation, which encourages the network to satisfy the provided ranking constraints by detecting objects. We then demonstrate the effectiveness of our method for exploiting pairwise image rankings as a weakly supervised signal for object counting and localization on several datasets, and show results with a performance that approaches that of fully supervised methods on many counting benchmark datasets while relying on data that can be collected with a fraction of the annotation burden.

PDF Abstract