Learn to cycle: Time-consistent feature discovery for action recognition

15 Jun 2020 · Alexandros Stergiou, Ronald Poppe

Generalizing over temporal variations is a prerequisite for effective action recognition in videos. Despite significant advances in deep neural networks, it remains a challenge to focus on short-term discriminative motions in relation to the overall performance of an action. We address this challenge by allowing some flexibility in discovering relevant spatio-temporal features. We introduce Squeeze and Recursion Temporal Gates (SRTG), an approach that favors inputs with similar activations with potential temporal variations. We implement this idea with a novel CNN block that uses an LSTM to encapsulate feature dynamics, in conjunction with a temporal gate that is responsible for evaluating the consistency of the discovered dynamics and the modeled features. We show consistent improvement when using SRTG blocks, with only a minimal increase in the number of GFLOPs. On Kinetics-700, we perform on par with current state-of-the-art models, and outperform these on HACS, Moments in Time, UCF-101 and HMDB-51.
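The gating idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that "squeeze" means spatial average pooling per time step, substitutes an exponential moving average for the LSTM, assumes the temporal gate is a nearest-neighbour cycle-consistency check between the pooled features and the recurrent output, and assumes the two are fused by channel-wise multiplication when the gate opens. All function names and the `decay` parameter are hypothetical.

```python
from typing import List

Vector = List[float]
Clip = List[List[Vector]]  # clip[t][p][c]: time step, spatial position, channel


def squeeze(clip: Clip) -> List[Vector]:
    """Average each channel over all spatial positions: one vector per time step."""
    pooled = []
    for positions in clip:
        n_channels = len(positions[0])
        pooled.append([sum(p[c] for p in positions) / len(positions)
                       for c in range(n_channels)])
    return pooled


def recur(sequence: List[Vector], decay: float) -> List[Vector]:
    """Stand-in for the LSTM (assumption): exponential moving average over time."""
    state = list(sequence[0])
    outputs = [list(state)]
    for vec in sequence[1:]:
        state = [decay * s + (1.0 - decay) * v for s, v in zip(state, vec)]
        outputs.append(list(state))
    return outputs


def nearest(query: Vector, candidates: List[Vector]) -> int:
    """Index of the candidate with the smallest squared Euclidean distance."""
    dists = [sum((q - c) ** 2 for q, c in zip(query, cand)) for cand in candidates]
    return dists.index(min(dists))


def is_cycle_consistent(a: List[Vector], b: List[Vector]) -> bool:
    """True if each step of `a` maps to a step of `b` that maps back to it."""
    return all(nearest(b[nearest(a[t], b)], a) == t for t in range(len(a)))


def gated_block(clip: Clip, decay: float = 0.1) -> Clip:
    """Toy gate: fuse the recurrent summary only when it is cycle-consistent."""
    pooled = squeeze(clip)
    dynamics = recur(pooled, decay)
    if not is_cycle_consistent(pooled, dynamics):
        return clip  # gate closed: activations pass through unchanged
    # Gate open: rescale every channel by the recurrent summary (assumed fusion).
    return [[[x * d for x, d in zip(pos, dyn)] for pos in positions]
            for positions, dyn in zip(clip, dynamics)]


# Three time steps, one spatial position, one channel.
clip = [[[1.0]], [[2.0]], [[3.0]]]
opened = gated_block(clip, decay=0.1)
closed = gated_block(clip, decay=0.5)
```

With `decay=0.5` the smoothed sequence lags far enough behind the input that the last time step no longer maps back to itself, so the gate stays closed and the clip is returned unchanged. With `decay=0.1` the two sequences stay aligned and the activations are rescaled.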

Results from the Paper


Task                   Dataset          Model             Top-1 Acc. (%)  Global Rank  Top-5 Acc. (%)  Global Rank
Action Recognition     HACS             SRTG r3d-34       78.60           #7           93.57           #7
Action Recognition     HACS             SRTG r3d-50       80.36           #6           95.55           #5
Action Recognition     HACS             SRTG r(2+1)d-101  84.33           #2           96.85           #2
Action Recognition     HACS             SRTG r3d-101      81.66           #4           96.33           #4
Action Recognition     HACS             SRTG r(2+1)d-34   80.39           #5           94.27           #6
Action Recognition     HACS             SRTG r(2+1)d-50   83.77           #3           96.56           #3
Action Classification  Kinetics-700     SRTG r(2+1)d-50   54.17           #24          74.62           #13
Action Classification  Kinetics-700     SRTG r3d-34       49.15           #28          72.68           #16
Action Classification  Kinetics-700     SRTG r3d-50       53.52           #25          74.17           #14
Action Classification  Kinetics-700     SRTG r(2+1)d-34   49.43           #27          73.23           #15
Action Classification  Kinetics-700     SRTG r3d-101      56.46           #23          76.82           #12
Action Classification  Moments in Time  SRTG r3d-101      33.56           #16          58.49           #11
Action Classification  Moments in Time  SRTG r3d-34       28.55           #24          52.35           #17
Action Classification  Moments in Time  SRTG r3d-50       30.72           #21          55.65           #14
Action Classification  Moments in Time  SRTG r(2+1)d-50   31.60           #20          56.80           #12
Action Classification  Moments in Time  SRTG r(2+1)d-34   28.97           #23          54.18           #15
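
For readers unfamiliar with the metrics, Top-1 and Top-5 accuracy are conventionally the percentage of clips whose true label is among the 1 or 5 highest-scoring predicted classes. The sketch below shows that calculation on made-up scores; the function name and the toy data are illustrative assumptions, not taken from the paper or its evaluation code.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Toy data (hypothetical): three clips, three classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 1, ranked first
    [0.5, 0.3, 0.2],  # true class 1, ranked second
    [0.2, 0.3, 0.5],  # true class 0, ranked last
]
labels = [1, 1, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

On this toy data one of three clips is correct at k=1 and two of three at k=2, which is why Top-5 figures are always at least as high as the corresponding Top-1 figures.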

Methods