Improved Knowledge Distillation for Crowd Counting on IoT Device

Manual crowd counting in real-world settings is either infeasible or produces wildly inaccurate estimates, and deep learning has been widely applied to address this. Crowd counting is a computationally intensive task, so many crowd counting models employ large-scale deep convolutional neural networks (CNNs) to achieve higher accuracy. However, this typically comes at the cost of computational efficiency and inference speed, which makes such approaches difficult to deploy in real-world settings, e.g., on Internet-of-Things (IoT) devices. One way to tackle this problem is to compress models using pruning and quantization, or to adopt lightweight backbones; however, such methods often incur a significant loss of accuracy. To address this, some studies have explored knowledge distillation, which extracts useful information from a large state-of-the-art (teacher) model to guide/train a smaller (student) model. However, knowledge distillation methods suffer from information loss caused by hint-transformers, and the teacher model may even have a negative impact on the student. In this work, we propose a knowledge distillation method that uses self-transformed hints and loss functions that ignore outliers to tackle challenging real-world crowd counting tasks. Our approach achieves an MAE of 77.24 and an MSE of 276.17 on the JHU-CROWD++ [1] test set. This is comparable to state-of-the-art deep crowd counting models, but at a fraction of the original model size and complexity, making the solution suitable for IoT devices. The source code is available at https://github.com/huangzuo/effcc_distilled.
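The exact self-transform and outlier-ignoring loss are defined in the paper and its repository; as a rough illustration only, the PyTorch sketch below shows one plausible reading: a learned 1x1-convolution "self-transform" that projects the student's features into the teacher's channel space before comparison, plus a trimmed hint loss that discards the largest per-element errors so that outliers do not dominate the gradient. All names here (`SelfTransformHint`, `robust_hint_loss`, `trim_ratio`) and the specific trimming scheme are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn


class SelfTransformHint(nn.Module):
    """Hypothetical hint module: a 1x1 convolution the student learns
    itself to map its feature maps into the teacher's channel dimension,
    so no separate hint-transformer is bolted onto the teacher."""

    def __init__(self, student_ch: int, teacher_ch: int):
        super().__init__()
        self.proj = nn.Conv2d(student_ch, teacher_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def robust_hint_loss(student_feat: torch.Tensor,
                     teacher_feat: torch.Tensor,
                     trim_ratio: float = 0.1) -> torch.Tensor:
    """Outlier-ignoring hint loss (trimmed-mean variant, hypothetical):
    compute per-element squared errors, drop the largest `trim_ratio`
    fraction, and average the rest."""
    err = (student_feat - teacher_feat).pow(2).flatten()
    k = max(1, int(err.numel() * (1.0 - trim_ratio)))
    trimmed, _ = torch.topk(err, k, largest=False)  # keep the smallest errors
    return trimmed.mean()


# Toy usage with random feature maps (channel dims are illustrative).
student_feat = torch.randn(2, 64, 32, 32)
teacher_feat = torch.randn(2, 256, 32, 32)
hint = SelfTransformHint(student_ch=64, teacher_ch=256)
loss = robust_hint_loss(hint(student_feat), teacher_feat, trim_ratio=0.1)
```

Trimming, rather than down-weighting, is just one way to realize "ignore outliers"; a Huber-style loss would be another reasonable interpretation.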

Results from the Paper


Task            Dataset       Model           Metric   Value    Global Rank
Crowd Counting  JHU-CROWD++   EffCC-Lite0.5   MAE      77.24    # 1
Crowd Counting  JHU-CROWD++   EffCC-Lite0.5   MSE      276.17   # 1
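For context, these metrics are computed from per-image predicted vs. ground-truth total counts; note that crowd counting papers conventionally report the root of the mean squared error under the name MSE. A minimal sketch of the standard computation (the helper name is ours):

```python
import torch


def crowd_counting_metrics(pred_counts, gt_counts):
    """MAE and MSE as conventionally reported for crowd counting:
    MAE = mean(|pred - gt|), and the value labeled MSE is the root
    of the mean squared count error."""
    pred = torch.as_tensor(pred_counts, dtype=torch.float32)
    gt = torch.as_tensor(gt_counts, dtype=torch.float32)
    diff = pred - gt
    mae = diff.abs().mean().item()
    mse = diff.pow(2).mean().sqrt().item()
    return mae, mse


# Toy example: three images with predicted vs. ground-truth counts.
print(crowd_counting_metrics([105.0, 480.0, 33.0], [100.0, 500.0, 30.0]))
```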
