Few Shot Action Recognition

Few-shot (FS) action recognition is a challenging computer vision problem, where the task is to classify an unlabelled query video into one of the action categories in the support set having limited samples per action class.
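
To make the task setup concrete, here is a minimal sketch of one episode: a query feature vector is assigned to the support class with the nearest mean feature. The nearest-prototype rule is only a generic baseline chosen for illustration, not a method from any paper listed here, and the feature vectors, labels, and function names are made up.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def classify_query(query, support):
    """Return the label whose class prototype is closest to the query.

    support: dict mapping label -> list of feature vectors (the K shots).
    Assumption: each video has already been encoded as one feature vector.
    """
    prototypes = {label: mean_vector(shots) for label, shots in support.items()}
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Toy 2-way 2-shot episode with made-up 2-D video features.
support = {
    "jump": [[0.9, 0.1], [1.1, -0.1]],
    "wave": [[-1.0, 0.2], [-0.8, 0.0]],
}
predicted = classify_query([0.7, 0.0], support)
```

In this toy episode the query lies closer to the "jump" prototype, so that label is returned.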

Temporal-Relational CrossTransformers for Few-Shot Action Recognition

CVPR 2021

We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set.
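
As a rough illustration of what "frame tuples" could look like, the sketch below enumerates temporally ordered index tuples from a sampled clip. It assumes tuples are ordered subsets of frame indices; the helper name is hypothetical, and the matching between query and support tuples is not shown.

```python
from itertools import combinations

def ordered_frame_tuples(num_frames, tuple_size):
    """All index tuples (i1 < i2 < ...) of the given size, in temporal order."""
    return list(combinations(range(num_frames), tuple_size))

# A clip sampled at 4 frames yields 6 ordered pairs and 4 ordered triples.
pairs = ordered_frame_tuples(4, 2)
triples = ordered_frame_tuples(4, 3)
```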

Action Genome: Actions as Composition of Spatio-temporal Scene Graphs

15 Dec 2019

Next, by decomposing and learning the temporal changes in visual relationships that result in an action, we demonstrate the utility of a hierarchical event decomposition by enabling few-shot action recognition, achieving 42.7% mAP using as few as 10 examples.

Few-shot Action Recognition with Permutation-invariant Attention

ECCV 2020

Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependencies whose patterns are unlikely to repeat even in clips of the same class.
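
The key property of permutation-invariant pooling is that reordering the blocks leaves the aggregate unchanged. The sketch below shows this with mean and max pooling, which are generic examples and not necessarily the pooling used in the paper; the block features are made up.

```python
import random

def mean_pool(blocks):
    """Average the block features element-wise; order of blocks is irrelevant."""
    n = len(blocks)
    return [sum(col) / n for col in zip(*blocks)]

def max_pool(blocks):
    """Take the element-wise maximum over blocks; also order-independent."""
    return [max(col) for col in zip(*blocks)]

# Three made-up 2-D block encodings and a shuffled copy of them.
blocks = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]
shuffled = blocks[:]
random.Random(0).shuffle(shuffled)

same_mean = mean_pool(blocks) == mean_pool(shuffled)
same_max = max_pool(blocks) == max_pool(shuffled)
```

Because the pooled vector has a fixed size regardless of how many blocks go in, the same aggregation also handles clips of different lengths.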

Depth Guided Adaptive Meta-Fusion Network for Few-shot Video Recognition

20 Oct 2020

Humans can easily recognize actions with only a few examples given, while the existing video recognition models still heavily rely on the large-scale labeled data inputs.

Few-shot Action Recognition with Prototype-centered Attentive Learning

20 Jan 2021

Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark.

Home Action Genome: Cooperative Compositional Action Understanding

CVPR 2021

However, there remains a lack of studies that extend action composition and leverage multiple viewpoints and multiple modalities of data for representation learning.

TA2N: Two-Stage Action Alignment Network for Few-shot Action Recognition

10 Jul 2021

The first stage locates the action by learning a temporal affine transform, which warps each video feature to its action duration while dismissing the action-irrelevant feature (e.g., background).
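
A temporal affine transform can be pictured as resampling the frame features at positions t' = scale * t + shift on a normalised time axis. The sketch below applies such a warp with hand-picked parameters and linear interpolation; in the paper the transform is learned, and the function name, parameter names, and clamping choice here are assumptions made for illustration.

```python
def temporal_affine_warp(features, scale, shift, out_len):
    """Resample per-frame features at normalised times scale * t + shift.

    features: list of per-frame feature vectors, time normalised to [0, 1].
    Positions falling outside the clip are clamped to its first or last frame.
    """
    last = len(features) - 1
    warped = []
    for k in range(out_len):
        t = k / (out_len - 1) if out_len > 1 else 0.0
        src = min(max(scale * t + shift, 0.0), 1.0) * last
        lo = int(src)                # frame just before the sampling position
        hi = min(lo + 1, last)       # frame just after it
        w = src - lo                 # linear interpolation weight
        warped.append([(1 - w) * a + w * b
                       for a, b in zip(features[lo], features[hi])])
    return warped

# Five frames with 1-D features; keep only the middle half of the clip.
frames = [[0.0], [1.0], [2.0], [3.0], [4.0]]
middle = temporal_affine_warp(frames, scale=0.5, shift=0.25, out_len=3)
```

With scale 0.5 and shift 0.25 the warp keeps frames 1 to 3 and drops the first and last frames, which stand in for background.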

A New Split for Evaluating True Zero-Shot Action Recognition

27 Jul 2021

We benchmark several recent approaches on the proposed True Zero-Shot (TruZe) Split for UCF101 and HMDB51, with zero-shot and generalized zero-shot evaluation.

Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification

ICLR 2022

Explainable distances for sequence data depend on temporal alignment to tackle sequences with different lengths and local variances.
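
For readers unfamiliar with alignment-based distances, classic dynamic time warping (DTW) is a well-known example and is sketched below for 1-D sequences. DTW is used here only as an illustration of how alignment handles different lengths and local timing differences; it is not the method this paper proposes.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# The same pattern performed slowly and quickly aligns at zero cost.
slow = [0.0, 0.0, 1.0, 1.0, 2.0]
fast = [0.0, 1.0, 2.0]
distance = dtw_distance(slow, fast)
```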

Revisiting spatio-temporal layouts for compositional action recognition

2 Nov 2021

Recognizing human actions is fundamentally a spatio-temporal reasoning problem, and should be, at least to some extent, invariant to the appearance of the human and the objects involved.