In crowd counting datasets, each person is annotated with a single point, usually placed at the center of the head.
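From such point annotations, ground-truth density maps are commonly generated by placing a unit-mass Gaussian at each annotated head location, so the map sums to the person count. Below is a minimal sketch of this standard construction; the function name and the fixed `sigma` are illustrative choices, not taken from any specific paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(points, height, width, sigma=4.0):
    """Build a crowd density map from head-point annotations.

    Each annotated point contributes a unit-mass Gaussian, so the
    resulting map integrates (sums) to the number of people.
    """
    density = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < height and 0 <= xi < width:
            density[yi, xi] += 1.0
    # gaussian_filter with the default mode='reflect' folds tail mass
    # back into the image, preserving the total count.
    return gaussian_filter(density, sigma=sigma)
```

Some works instead use a geometry-adaptive `sigma` (scaled by the distance to neighbouring heads) so that blob size roughly tracks head size in perspective scenes.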
Then, to relate the density maps of neighbouring frames, a Locality-constrained Spatial Transformer (LST) module is introduced to estimate the density map of the next frame from that of the current frame.
This technical report aims to provide an efficient and solid toolkit for the field of crowd counting, denoted the Crowd Counting Code Framework (C$^3$F).
We introduce a detection framework for dense crowd counting and eliminate the need for the prevalent density regression paradigm.
Crowd counting from a single image is a challenging task due to high appearance similarity, perspective changes and severe congestion.
Extensive experiments demonstrate that our network produces much better results on unseen datasets compared with existing counting adaption models.
Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting.
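The ranking signal on unlabeled data comes from a containment constraint: a crop of an image must contain at least as many people as any sub-crop inside it, so the predicted counts can be ordered without ground truth. A minimal sketch of such a hinge-style ranking loss is shown below; the function name, array interface, and zero margin are assumptions for illustration, not the exact formulation of any particular paper.

```python
import numpy as np

def crop_ranking_loss(count_full, count_sub, margin=0.0):
    """Hinge-style ranking loss over predicted crowd counts.

    count_full : predicted counts for crops (array-like).
    count_sub  : predicted counts for sub-crops contained in them.
    Penalizes any case where a sub-crop's predicted count exceeds
    its enclosing crop's predicted count (plus an optional margin).
    """
    count_full = np.asarray(count_full, dtype=np.float64)
    count_sub = np.asarray(count_sub, dtype=np.float64)
    # zero loss when the ordering constraint is satisfied
    return np.maximum(count_sub - count_full + margin, 0.0).mean()
```

In a training loop this term would be added, with some weight, to the supervised density-regression loss on the labeled data.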
The task of crowd counting in varying-density scenes is extremely challenging due to large variations in scale.
In this work, we explore the cross-scale similarity in crowd counting scenario, in which the regions of different scales often exhibit high visual similarity.