Self-Supervised Action Recognition
21 papers with code • 6 benchmarks • 4 datasets
Our representations are learned using a contrastive loss, where two augmented clips from the same short video are pulled together in the embedding space, while clips from different videos are pushed away.
Ranked #1 on Self-Supervised Action Recognition on Kinetics-400 (using extra training data)
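The contrastive objective described in the entry above (clips from the same video pulled together, clips from different videos pushed away) can be illustrated with a minimal sketch. The snippet does not say which contrastive loss is used, so this assumes an InfoNCE-style loss over cosine similarities. The function names, the `temperature` value, and the toy embeddings are hypothetical and chosen only for illustration; this is not the paper's implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(clips_a, clips_b, temperature=0.1):
    """InfoNCE-style contrastive loss over paired clip embeddings.

    clips_a[i] and clips_b[i] are embeddings of two augmented clips
    from the same video (the positive pair). Every clips_b[j] with
    j != i comes from a different video and acts as a negative.
    The temperature value is an illustrative assumption.
    """
    a = [l2_normalize(v) for v in clips_a]
    b = [l2_normalize(v) for v in clips_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every clip in the other view.
        logits = [
            sum(x * y for x, y in zip(anchor, other)) / temperature
            for other in b
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive pair as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


# Toy embeddings for three videos (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = info_nce_loss(view_a, view_a)
# Rotating the second view pairs each clip with a different video.
shuffled = info_nce_loss(view_a, view_a[1:] + view_a[:1])
```

In this toy setup the loss is lower when each clip is paired with a clip from the same video than when the pairing is shuffled, which is the behaviour the objective is meant to reward.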
In particular, we explore how best to combine the modalities, such that fine-grained representations of the visual and audio modalities can be maintained, whilst also integrating text into a common embedding.
Ranked #1 on Self-Supervised Action Recognition on HMDB51
We analyze key properties of the approach that make it work, finding that the contrastive loss outperforms a popular alternative based on cross-view prediction, and that the more views we learn from, the better the resulting representation captures underlying scene semantics.
Ranked #39 on Self-Supervised Action Recognition on UCF101
The objective of this paper is visual-only self-supervised video representation learning.
Ranked #12 on Self-Supervised Action Recognition on UCF101 (finetuned)
The objective of this paper is self-supervised learning of spatio-temporal embeddings from video, suitable for human action recognition.
Ranked #24 on Self-Supervised Action Recognition on UCF101
With the proposed Inter-Intra Contrastive (IIC) framework, we can train spatio-temporal convolutional networks to learn video representations.
Ranked #6 on Self-supervised Video Retrieval on HMDB51
Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics
We conduct extensive experiments with C3D to validate the effectiveness of our proposed approach.
Ranked #36 on Self-Supervised Action Recognition on HMDB51
To the best of our knowledge, XDC is the first self-supervised learning method that outperforms large-scale fully-supervised pretraining for action recognition on the same architecture.
Ranked #1 on Self-Supervised Action Recognition on UCF101