CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

CVPR 2018 · Yuhong Li, Xiaofan Zhang, Deming Chen

We propose a network for Congested Scene Recognition called CSRNet, a data-driven deep learning method that can understand highly congested scenes, perform accurate count estimation, and present high-quality density maps. The proposed CSRNet is composed of two major components: a convolutional neural network (CNN) as the front-end for 2D feature extraction and a dilated CNN as the back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations... CSRNet is easy to train because of its purely convolutional structure. We demonstrate CSRNet on four datasets (the ShanghaiTech dataset, the UCF_CC_50 dataset, the WorldEXPO'10 dataset, and the UCSD dataset) and deliver state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset. Results show that CSRNet significantly improves the output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
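The abstract's claim that dilated kernels enlarge the receptive field without pooling can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' implementation: the function names are hypothetical, and the layer count and dilation rate in the usage note are assumed values, not figures taken from the abstract. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) positions along one axis.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Positions spanned along one axis by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(num_layers: int, kernel_size: int, dilation: int) -> int:
    """Receptive field of num_layers stride-1 conv layers sharing one kernel size and dilation."""
    field = 1
    for _ in range(num_layers):
        # Each stride-1 layer widens the field by (effective kernel size - 1).
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


def dilated_conv1d(signal, kernel, dilation=1):
    """Unpadded 1D dilated cross-correlation on plain Python lists."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart in the input.
        out.append(sum(w * signal[start + j * dilation] for j, w in enumerate(kernel)))
    return out
```

Under these assumptions, six stacked 3x3 layers with dilation 2 cover 25 positions per axis, compared with 13 for the same stack without dilation, and neither stack reduces the output resolution the way a pooling layer would.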


Results from the Paper


Task            Dataset         Model   Metric Name  Metric Value  Global Rank
Crowd Counting  ShanghaiTech A  CSRNet  MAE          68.2          #8
Crowd Counting  ShanghaiTech B  CSRNet  MAE          10.6          #8
Crowd Counting  UCF CC 50       CSRNet  MAE          266.1         #8
Crowd Counting  WorldExpo’10    CSRNet  Average MAE  8.6           #6
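The MAE values reported for these benchmarks compare predicted and ground-truth counts per image. As a minimal sketch of how such a score is usually computed in crowd counting, assuming the predicted count is the sum over the estimated density map (the function names here are hypothetical and not taken from the paper's code):

```python
def count_from_density(density_map):
    """Estimated object count: the sum of all values in a 2D density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted_counts, true_counts):
    """Average absolute difference between predicted and ground-truth counts."""
    if not predicted_counts or len(predicted_counts) != len(true_counts):
        raise ValueError("count lists must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_counts, true_counts))
    return total / len(predicted_counts)
```

A lower MAE means the predicted counts are closer to the annotated counts on average. The score is in units of objects per image, so values are not directly comparable across datasets with very different crowd sizes.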

Results from Other Papers


Task            Dataset  Model   Metric Name  Metric Value  Rank
Crowd Counting  Venice   CSRNet  MAE          35.8          #3
