ECO: Efficient Convolutional Network for Online Video Understanding

ECCV 2018 · Mohammadreza Zolfaghari, Kamaljeet Singh, Thomas Brox

The state of the art in video understanding suffers from two problems: (1) The major part of reasoning is performed locally in the video; therefore, it misses important relationships within actions that span several seconds. (2) While there are local methods with fast per-frame processing, the processing of the whole video is not efficient and hampers fast video retrieval or online classification of long-term activities...

Evaluation results from the paper

#11 best model for Action Recognition In Videos on Something-Something V1 (using extra training data)

| Task | Dataset | Model | Metric name | Metric value | Global rank | Uses extra training data |
|---|---|---|---|---|---|---|
| Action Recognition In Videos | Something-Something V1 | ECO-Net (ImageNet pretrained) | Top 1 Accuracy | 46.4 | #11 | Yes |