Busy-Quiet Video Disentangling for Video Classification

29 Mar 2021 · Guoxi Huang, Adrian G. Bors

In video data, busy motion details from moving regions are conveyed within a specific frequency band. The remaining frequencies encode quiet information with substantial redundancy, which lowers the processing efficiency of existing video models that take raw RGB frames as input. In this paper, we allocate more intensive computation to the important busy information and less computation to the quiet information. We design a trainable Motion Band-Pass Module (MBPM) that separates busy information from quiet information in raw video data. By embedding the MBPM into a two-pathway CNN architecture, we define a Busy-Quiet Net (BQN). BQN gains its efficiency by avoiding redundancy in the feature space processed by the two pathways: one operates on low-resolution Quiet features, while the other processes Busy features. The proposed BQN outperforms many recent video processing models on the Something-Something V1, Kinetics400, UCF101 and HMDB51 datasets.
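To make the busy/quiet split in the abstract concrete, here is a minimal NumPy sketch of a fixed temporal band-pass filter. It is not the paper's MBPM: the MBPM is trainable, and the abstract does not specify its design. The sketch swaps in a hand-set difference of two Gaussian low-pass filters along the time axis, and the function names (`gaussian_kernel_1d`, `temporal_smooth`, `split_busy_quiet`), the sigma values, and the 2x spatial downsampling of the quiet stream are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Return a normalized 1D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def temporal_smooth(video, sigma, radius):
    """Low-pass filter a (T, H, W) video along the time axis."""
    kernel = gaussian_kernel_1d(sigma, radius)
    # Repeat the first and last frames so the output keeps T frames.
    padded = np.pad(video, ((radius, radius), (0, 0), (0, 0)), mode="edge")
    out = np.zeros_like(video, dtype=np.float64)
    for i, w in enumerate(kernel):
        out += w * padded[i:i + video.shape[0]]
    return out


def split_busy_quiet(video, sigma_narrow=0.5, sigma_wide=2.0, radius=4):
    """Split a (T, H, W) video into a band-pass and a low-pass stream.

    Hypothetical stand-in for a motion band-pass step: the sigmas are
    fixed by hand here, not learned.
    """
    video = video.astype(np.float64)
    narrow = temporal_smooth(video, sigma_narrow, radius)
    wide = temporal_smooth(video, sigma_wide, radius)
    busy = narrow - wide        # difference of Gaussians: band-pass in time
    quiet = wide[:, ::2, ::2]   # low-pass stream at half spatial resolution (assumed factor)
    return busy, quiet


if __name__ == "__main__":
    # A dot that moves one pixel per frame along the top row.
    clip = np.zeros((8, 4, 4))
    for t in range(8):
        clip[t, 0, t % 4] = 1.0
    busy, quiet = split_busy_quiet(clip)
    print(busy.shape, quiet.shape)
```

In this sketch a clip that does not change over time produces an all-zero busy stream, because both smoothed copies equal the input, while moving content survives the subtraction. The quiet stream keeps the slowly varying content at reduced resolution, which mirrors the idea of spending less computation on redundant information.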


Results from the Paper


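| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Action Recognition | HMDB-51 | BQN | Average accuracy of 3 splits | 77.6 | # 30 |
| Action Classification | Kinetics-400 | BQN (ResNet-50) | Acc@1 | 77.3 | # 136 |
| Action Classification | Kinetics-400 | BQN (ResNet-50) | Acc@5 | 93.2 | # 97 |
| Action Recognition | Something-Something V1 | BQNEn (ImageNet + K400 pretrained) | Top 1 Accuracy | 57.1 | # 16 |
| Action Recognition | Something-Something V1 | BQNEn (ImageNet + K400 pretrained) | Top 5 Accuracy | 84.2 | # 9 |
| Action Recognition | UCF101 | BQN | 3-fold Accuracy | 97.6 | # 15 |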
