A Closer Look at Spatiotemporal Convolutions for Action Recognition

CVPR 2018 · Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have remained solid performers in action recognition...

Evaluation Results from the Paper


| Task | Dataset | Model | Metric | Value | Global rank |
|---|---|---|---|---|---|
| Action Recognition In Videos | Sports-1M | R(2+1)D-RGB-32frame | Clip Hit@1 | 57 | #1 |
| Action Recognition In Videos | Sports-1M | R(2+1)D-RGB-32frame | Video Hit@1 | 73 | #4 |
| Action Recognition In Videos | Sports-1M | R(2+1)D-RGB-32frame | Video Hit@5 | 91.5 | #4 |
| Action Recognition In Videos | Sports-1M | R(2+1)D-Two-Stream-32frame | Video Hit@1 | 73.3 | #3 |
| Action Recognition In Videos | Sports-1M | R(2+1)D-Two-Stream-32frame | Video Hit@5 | 91.9 | #3 |