Anomalous Event Recognition in Videos Based on Joint Learning of Motion and Appearance with Multiple Ranking Measures

Given the scarcity of annotated datasets, learning the context dependency of anomalous events and mitigating false alarms are key challenges in anomalous activity detection. We propose a framework, Deep network with Multiple Ranking Measures (DMRMs), which addresses context dependency through joint learning of motion and appearance features. In DMRMs, spatio-temporal appearance features are extracted from a video using a 3D residual network (ResNet), and deep motion features are extracted by integrating motion flow maps with the 3D ResNet. The extracted features are then fused for joint learning. The fused representation is passed through a deep neural network trained with deep multiple instance learning (DMIL) to learn context dependency in a weakly supervised manner using the proposed multiple ranking measures (MRMs). These MRMs account for multiple sources of false alarms, and the network is trained on both normal and anomalous events, thus lowering the false alarm rate. In the inference phase, the network predicts an abnormality score for each frame and localizes moving objects using the motion flow maps; a higher abnormality score indicates the presence of an anomalous event. Experimental results on two recent and challenging datasets demonstrate that the proposed framework improves the area under the curve (AUC) score by 6.5% over the state-of-the-art method on the UCF-Crime dataset and achieves an AUC of 68.5% on the ShanghaiTech dataset.
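
The abstract does not spell out the exact form of the multiple ranking measures, so the following is only a minimal sketch of a DMIL-style ranking loss over per-segment abnormality scores, assuming the widely used max-margin MIL ranking formulation with smoothness and sparsity terms; the second, mean-based ranking term and all identifiers (e.g., mil_ranking_loss, scores_anom) are illustrative assumptions, not the paper's definition.

```python
# Illustrative sketch of a weakly-supervised MIL ranking loss for video
# anomaly detection. NOT the paper's exact MRMs; an assumption-based example.
import torch


def mil_ranking_loss(scores_anom: torch.Tensor,
                     scores_norm: torch.Tensor,
                     lambda_smooth: float = 8e-5,
                     lambda_sparse: float = 8e-5) -> torch.Tensor:
    """scores_anom / scores_norm: per-segment abnormality scores in [0, 1]
    for one anomalous and one normal video (shape: [num_segments])."""
    # Ranking measure 1: the highest-scored anomalous segment should outrank
    # the highest-scored normal segment (classic max-max hinge ranking).
    rank_max = torch.relu(1.0 - scores_anom.max() + scores_norm.max())

    # Ranking measure 2 (assumed for illustration): the anomalous maximum
    # should also outrank the *average* normal score, penalising globally
    # inflated normal scores, one plausible way to reduce false alarms.
    rank_mean = torch.relu(1.0 - scores_anom.max() + scores_norm.mean())

    # Temporal smoothness: neighbouring segments should score similarly.
    smooth = ((scores_anom[1:] - scores_anom[:-1]) ** 2).sum()

    # Sparsity: anomalies are rare, so anomalous-bag scores should be sparse.
    sparse = scores_anom.sum()

    return rank_max + rank_mean + lambda_smooth * smooth + lambda_sparse * sparse


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy per-segment scores, standing in for the output of a sigmoid head
    # applied to fused 3D-ResNet appearance and motion-flow features.
    scores_anom = torch.rand(32, requires_grad=True)
    scores_norm = torch.rand(32, requires_grad=True)
    loss = mil_ranking_loss(scores_anom, scores_norm)
    loss.backward()
    print(float(loss))
```

In this formulation only video-level labels (normal vs. anomalous) are needed during training, which matches the weakly supervised setting described above.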


Datasets

UCF-Crime, ShanghaiTech

Results
Task: Anomaly Detection In Surveillance Videos
Dataset: UCF-Crime
Model: DMRMs

Metric Name    Metric Value    Global Rank
ROC AUC        81.91           # 10
Decidability   -               # 3
EER            -               # 3
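
For context, the frame-level ROC AUC reported above is typically computed by comparing per-frame abnormality scores against binary frame-level ground truth. Below is a minimal sketch using scikit-learn; the toy arrays and variable names (frame_labels, frame_scores) are illustrative only.

```python
# Minimal sketch: frame-level ROC AUC from predicted abnormality scores.
import numpy as np
from sklearn.metrics import roc_auc_score

# frame_labels: 1 for frames inside an annotated anomalous event, else 0.
frame_labels = np.array([0, 0, 0, 1, 1, 1, 0, 0])
# frame_scores: the network's predicted abnormality score for each frame.
frame_scores = np.array([0.05, 0.10, 0.20, 0.80, 0.90, 0.70, 0.15, 0.10])

auc = roc_auc_score(frame_labels, frame_scores)
print(f"Frame-level ROC AUC: {auc:.4f}")  # higher is better
```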

Methods