Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

CVPR 2017 · Joao Carreira, Andrew Zisserman

The paucity of videos in current action classification datasets (UCF-101 and HMDB-51) has made it difficult to identify good video architectures, as most methods obtain similar performance on existing small-scale benchmarks. This paper re-evaluates state-of-the-art architectures in light of the new Kinetics Human Action Video dataset...

Evaluation results from the paper


#3 best model for Action Classification on HMDB51 (using extra training data)

Task | Dataset | Model | Metric name | Metric value | Global rank
Action Classification | Charades | I3D | MAP | 32.9 | #6
Action Classification | HMDB51 | Two-stream I3D | Accuracy | 80.9 | #3
Skeleton Based Action Recognition | J-HMDB | I3D | Accuracy | 84.1 | #3
Action Classification | Moments in Time | I3D | Top 1 Accuracy | 29.51% | #4
Action Classification | Moments in Time | I3D | Top 5 Accuracy | 56.06% | #3
Action Recognition In Videos | UCF101 | Two-stream I3D (on pre-trained) | 3-fold Accuracy | 98.0 | #3
Action Recognition In Videos | UCF101 | Two-stream I3D | 3-fold Accuracy | 93.4 | #14