Crowd Counting via Adversarial Cross-Scale Consistency Pursuit

Crowd counting, or density estimation, is a challenging task in computer vision because of large scale variations, perspective distortions, and severe occlusions. Existing methods generally suffer from two issues: 1) the model-averaging effect in multi-scale CNNs induced by the widely adopted L2 regression loss; and 2) inconsistent estimates across differently scaled inputs. To address these issues explicitly, we propose a novel crowd counting (density estimation) framework called Adversarial Cross-Scale Consistency Pursuit (ACSCP). On one hand, a U-Net-structured network is designed to generate a density map from an input patch, and an adversarial loss is employed to shrink the solution onto a realistic subspace, thus attenuating the blurring of the estimated density map. On the other hand, we design a novel scale-consistency regularizer which enforces that the sum of the crowd counts from local patches (i.e., small scale) is coherent with the overall count of their region union (i.e., large scale). These losses are integrated via a joint training scheme, which improves density estimation performance by exploiting the collaboration between the two objectives. Extensive experiments on four benchmarks demonstrate the effectiveness of the proposed innovations as well as superior performance over prior art.
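The scale-consistency idea in the abstract can be illustrated with a small count-level sketch. This is not the paper's implementation: the abstract does not give the exact form of the regularizer, so the squared count difference used here is an assumption, and the names `count`, `split_quadrants`, and `scale_consistency_penalty` are hypothetical. Density maps are plain nested lists so the example has no dependencies.

```python
from typing import List

DensityMap = List[List[float]]


def count(density: DensityMap) -> float:
    """Crowd count of a patch: the sum of its density map."""
    return sum(sum(row) for row in density)


def split_quadrants(density: DensityMap) -> List[DensityMap]:
    """Split a map into four non-overlapping quadrants (the small-scale patches)."""
    mid_h = len(density) // 2
    mid_w = len(density[0]) // 2
    return [
        [row[:mid_w] for row in density[:mid_h]],  # top-left
        [row[mid_w:] for row in density[:mid_h]],  # top-right
        [row[:mid_w] for row in density[mid_h:]],  # bottom-left
        [row[mid_w:] for row in density[mid_h:]],  # bottom-right
    ]


def scale_consistency_penalty(large_pred: DensityMap,
                              small_preds: List[DensityMap]) -> float:
    """Assumed penalty: squared gap between the large-scale count and the
    summed counts of the small-scale predictions covering the same region."""
    diff = count(large_pred) - sum(count(p) for p in small_preds)
    return diff * diff


# Prediction for the large patch (4x4, total count 8.0).
large = [[0.5] * 4 for _ in range(4)]

# Small-scale predictions that agree with the large one: penalty is zero.
consistent = split_quadrants(large)

# Small-scale predictions that overcount (each 2x2 map sums to 3.0).
inconsistent = [[[0.75] * 2 for _ in range(2)] for _ in range(4)]

penalty_consistent = scale_consistency_penalty(large, consistent)
penalty_inconsistent = scale_consistency_penalty(large, inconsistent)
```

In a joint training scheme of the kind the abstract describes, a term like this would be weighted and added to the adversarial and regression losses; the weighting is not specified in the abstract and is left out here.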

Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Crowd Counting | UCF CC 50 | ACSCP | MAE | 291.0 | # 13

Results from Other Papers


Task | Dataset | Model | Metric Name | Metric Value | Rank
Crowd Counting | ShanghaiTech A | ACSCP | MAE | 75.7 | # 27
Crowd Counting | ShanghaiTech B | ACSCP | MAE | 17.2 | # 22
Crowd Counting | WorldExpo’10 | ACSCP | Average MAE | 7.5 | # 4
