Temporally Coherent Embeddings for Self-Supervised Video Representation Learning

21 Mar 2020 · Joshua Knights, Anthony Vanderkop, Daniel Ward, Olivia Mackenzie-Ross, Peyman Moghadam

This paper presents TCE: Temporally Coherent Embeddings for self-supervised video representation learning. The proposed method exploits inherent structure of unlabeled video data to explicitly enforce temporal coherency in the embedding space, rather than indirectly learning it through ranking or predictive pretext tasks...

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Self-Supervised Action Recognition | UCF101 | TCE (ResNet-101) | 3-fold Accuracy | 67.79 | #3 |
| Self-Supervised Action Recognition | UCF101 | TCE (ResNet-50) | 3-fold Accuracy | 66.64 | #4 |