Evolving Space-Time Neural Architectures for Videos

26 Nov 2018
AJ Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo

We present a new method for finding video CNN architectures that capture rich spatio-temporal information in videos. Previous work, taking advantage of 3D convolutions, obtained promising results by manually designing video CNN architectures...

Evaluation results from the paper

State of the art for Action Classification on HMDB51 (using extra training data)

Task                   Dataset          Model   Metric name     Metric value  Global rank  Uses extra training data
Action Classification  Charades         EvaNet  MAP             38.1          # 5
Action Classification  HMDB51           EvaNet  Accuracy        82.1          # 1
Action Classification  Kinetics-400     EvaNet  Accuracy        77.4          # 7
Action Classification  Moments in Time  EvaNet  Top 1 Accuracy  31.8%         # 2