Actionness Estimation Using Hybrid Fully Convolutional Networks

CVPR 2016 · Limin Wang, Yu Qiao, Xiaoou Tang, Luc van Gool

Actionness was introduced to quantify the likelihood that a specific location contains a generic action instance. Accurate and efficient estimation of actionness is important in video analysis and may benefit related tasks such as action recognition and action detection. This paper presents a new deep architecture for actionness estimation, called the hybrid fully convolutional network (H-FCN), which is composed of an appearance FCN (A-FCN) and a motion FCN (M-FCN). These two FCNs leverage the strong capacity of deep models to estimate actionness maps from the perspectives of static appearance and dynamic motion, respectively. In addition, the fully convolutional nature of H-FCN allows it to efficiently process videos of arbitrary size. Experiments on the challenging Stanford40, UCF Sports, and JHMDB datasets verify the effectiveness of H-FCN for actionness estimation and demonstrate that our method outperforms previous ones. Moreover, we apply the estimated actionness maps to action proposal generation and action detection, where they substantially advance the state of the art.
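To make the two-stream, fully convolutional idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. Each "network" is reduced to a single hand-set 3x3 filter followed by a sigmoid, the inputs are single-channel grids rather than RGB frames and optical flow, and the two maps are fused by simple averaging. The function names, the kernels, and the averaging rule are all assumptions made for illustration. What the sketch does show is why a convolution-only model accepts inputs of any height and width and returns a per-location score map of the same size.

```python
import math


def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation; output has the input's height and width."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    # Positions outside the image count as zero (zero padding).
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def actionness_map(grid, kernel):
    """Hypothetical one-layer stand-in for an FCN: filter, then squash to (0, 1)."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row]
            for row in conv2d_same(grid, kernel)]


def hybrid_actionness(appearance, motion, k_appearance, k_motion):
    """Average the appearance and motion maps (assumed fusion rule)."""
    a = actionness_map(appearance, k_appearance)
    m = actionness_map(motion, k_motion)
    return [[(x + y) / 2.0 for x, y in zip(ra, rm)] for ra, rm in zip(a, m)]


# Hand-set Laplacian-like kernel standing in for learned weights (assumption).
KERNEL = [[0.0, 1.0, 0.0],
          [1.0, -4.0, 1.0],
          [0.0, 1.0, 0.0]]

for height, width in [(3, 5), (4, 7)]:
    frame = [[float(x + y) for x in range(width)] for y in range(height)]
    fused = hybrid_actionness(frame, frame, KERNEL, KERNEL)
    print(height, width, "->", len(fused), len(fused[0]))
```

The same filters are applied to both input sizes without any change, because nothing in the computation depends on a fixed spatial dimension.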


Results from the Paper

Task              Dataset  Model       Metric Name    Metric Value  Global Rank
Action Detection  J-HMDB   Actionness  Video-mAP 0.5  56.4          # 15
Action Detection  J-HMDB   Actionness  Frame-mAP 0.5  39.9          # 11
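
The "0.5" in the metric names is read here as an intersection-over-union (IoU) threshold, which is the usual convention for detection mAP; the listing itself does not state this, so treat it as an assumption. Under that reading, a predicted box counts as correct when its IoU with a ground-truth box is at least 0.5. The sketch below covers only the per-frame box overlap. It omits class-label matching, score ranking, and the spatio-temporal tube overlap that a video-level metric would need. The function names are illustrative.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.5):
    """True when the predicted box overlaps the ground truth enough to count."""
    return box_iou(pred, gt) >= threshold


print(box_iou((0, 0, 4, 4), (0, 0, 4, 3)))   # overlap 12, union 16
print(is_match((0, 0, 2, 2), (1, 0, 3, 2)))  # overlap 2, union 6
```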

