Few Shot Action Recognition

25 papers with code • 4 benchmarks • 5 datasets

Few-shot (FS) action recognition is a challenging computer vision problem: the task is to classify an unlabelled query video into one of the action categories in the support set, where each category is represented by only a few labelled samples.
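
To make the episodic setup concrete, below is a minimal sketch of classifying a query video in an N-way K-shot episode by nearest-prototype matching over pre-extracted video embeddings. The embedding dimensionality, cosine similarity, and mean-pooled prototypes are illustrative assumptions, not the recipe of any particular paper listed here.

```python
# Minimal sketch of an N-way K-shot episode with prototype matching.
# Assumes video embeddings have already been extracted by some backbone;
# the nearest-prototype rule is illustrative only.
import numpy as np

def classify_query(support_embs, support_labels, query_emb):
    """support_embs: (N*K, D), support_labels: (N*K,) ints, query_emb: (D,)."""
    classes = np.unique(support_labels)
    # One prototype per action class: mean of its K support embeddings.
    prototypes = np.stack([support_embs[support_labels == c].mean(axis=0) for c in classes])
    # Cosine similarity between the query and each prototype.
    prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    return classes[np.argmax(prototypes @ q)]

# Example 5-way 1-shot episode with random 128-d embeddings.
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 128))
labels = np.arange(5)
query = support[2] + 0.05 * rng.normal(size=128)
print(classify_query(support, labels, query))  # -> 2
```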

Most implemented papers

Revisiting spatio-temporal layouts for compositional action recognition

gorjanradevski/revisiting-spatial-temporal-layouts 2 Nov 2021

Recognizing human actions is fundamentally a spatio-temporal reasoning problem, and should be, at least to some extent, invariant to the appearance of the human and the objects involved.

Spatio-temporal Relation Modeling for Few-shot Action Recognition

Anirudh257/strm CVPR 2022

Experiments are performed on four few-shot action recognition benchmarks: Kinetics, SSv2, HMDB51 and UCF101.

Multi-level Second-order Few-shot Learning

hongguangzhang/mlso-tmm-master 15 Jan 2022

The goal of multi-level feature design is to extract feature representations at different layers of the CNN, realizing several levels of visual abstraction to achieve robust few-shot learning.
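
As a rough illustration of second-order features taken at multiple levels, the sketch below summarizes feature maps from two hypothetical CNN layers by their channel autocorrelation matrices. The chosen layers and the plain autocorrelation are assumptions, not the paper's exact formulation.

```python
# Sketch of second-order pooling applied at multiple feature levels: each
# level's spatial features are summarized by an autocorrelation matrix.
import torch

def second_order_pool(feat_map):
    """feat_map: (C, H, W) -> (C, C) second-order representation."""
    c = feat_map.shape[0]
    x = feat_map.reshape(c, -1)          # (C, H*W)
    return (x @ x.t()) / x.shape[1]      # channel autocorrelation

# Hypothetical feature maps from two CNN layers.
levels = [torch.randn(64, 28, 28), torch.randn(128, 14, 14)]
reps = [second_order_pool(f) for f in levels]
print([r.shape for r in reps])  # [torch.Size([64, 64]), torch.Size([128, 128])]
```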

Hybrid Relation Guided Set Matching for Few-shot Action Recognition

alibaba-mmai-research/HyRSM CVPR 2022

To overcome the two limitations, we propose a novel Hybrid Relation guided Set Matching (HyRSM) approach that incorporates two key components: hybrid relation module and set matching metric.
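
The sketch below illustrates the general idea of set matching between two videos treated as unordered sets of frame-level features, using a bidirectional mean-of-best-match score. It is a simplified stand-in for intuition, not HyRSM's exact set matching metric.

```python
# Sketch of a bidirectional mean-max set matching score between two videos,
# each represented as a set of frame-level features (T x D).
import torch
import torch.nn.functional as F

def set_matching_score(a, b):
    """a: (Ta, D), b: (Tb, D) L2-normalized frame features; higher = more similar."""
    sim = a @ b.t()                        # (Ta, Tb) pairwise cosine similarities
    a_to_b = sim.max(dim=1).values.mean()  # each frame of a matched to its best frame in b
    b_to_a = sim.max(dim=0).values.mean()  # and vice versa
    return 0.5 * (a_to_b + b_to_a)

query = F.normalize(torch.randn(8, 256), dim=1)
support = F.normalize(torch.randn(8, 256), dim=1)
print(set_matching_score(query, support).item())
```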

Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition

R00Kie-Liu/Sampler 20 Jul 2022

In this paper, we propose a novel video frame sampler for few-shot action recognition to address this issue, where task-specific spatial-temporal frame sampling is achieved via a temporal selector (TS) and a spatial amplifier (SA).
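
As a rough sketch of the temporal selection idea, the snippet below scores densely sampled candidate frames with a small learned scorer and keeps the top-T in temporal order. The scorer design and the hard top-k step are simplifying assumptions; the paper's actual TS/SA design differs in detail.

```python
# Illustrative temporal selector: a lightweight scorer ranks candidate frames
# and only the top-T are kept for the downstream few-shot classifier.
import torch
import torch.nn as nn

class TemporalSelector(nn.Module):
    def __init__(self, feat_dim, num_keep):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, 1)  # per-frame importance score
        self.num_keep = num_keep

    def forward(self, frame_feats):
        """frame_feats: (T, D) features of densely sampled candidate frames."""
        scores = self.scorer(frame_feats).squeeze(-1)          # (T,)
        keep = torch.topk(scores, self.num_keep).indices.sort().values
        return frame_feats[keep], keep                          # keep temporal order

selector = TemporalSelector(feat_dim=512, num_keep=8)
feats, idx = selector(torch.randn(32, 512))
print(idx.tolist())
```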

Uncertainty-DTW for Time Series and Sequences

leiwangr/udtw 30 Oct 2022

Dynamic Time Warping (DTW) is used for matching pairs of sequences and celebrated in applications such as forecasting the evolution of time series, clustering time series or even matching sequence pairs in few-shot action recognition.
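
For reference, a minimal implementation of standard DTW between two feature sequences is sketched below; the paper's uncertainty-aware variant is not reproduced here.

```python
# Minimal standard DTW between two feature sequences; included only to make
# the baseline concrete.
import numpy as np

def dtw(x, y):
    """x: (Tx, D), y: (Ty, D); returns the accumulated alignment cost."""
    tx, ty = len(x), len(y)
    cost = np.full((tx + 1, ty + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, tx + 1):
        for j in range(1, ty + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])  # local frame distance
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[tx, ty]

a = np.random.randn(10, 64)
b = np.random.randn(12, 64)
print(dtw(a, b))
```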

TempCLR: Temporal Alignment Representation with Contrastive Learning

yyuncong/tempclr 28 Dec 2022

For long videos, given a paragraph of description whose sentences describe different segments of the video, matching all sentence-clip pairs implicitly aligns the paragraph with the full video.

HyRSM++: Hybrid Relation Guided Temporal Set Matching for Few-shot Action Recognition

alibaba-mmai-research/hyrsmplusplus 9 Jan 2023

To be specific, HyRSM++ consists of two key components, a hybrid relation module and a temporal set matching metric.

CLIP-guided Prototype Modulating for Few-shot Action Recognition

alibaba-mmai-research/clip-fsar 6 Mar 2023

Learning from large-scale contrastive language-image pre-training like CLIP has shown remarkable success in a wide range of downstream tasks recently, but it is still under-explored on the challenging few-shot action recognition (FSAR) task.
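
The snippet below is an illustrative sketch of "prototype modulating": a visual class prototype is blended with the CLIP text embedding of its class name through a simple learned gate. The gated fusion is an assumption made for clarity; CLIP-FSAR's actual modulation module differs in detail.

```python
# Illustrative sketch: modulate visual class prototypes with CLIP text
# embeddings of the class names via a learned gate (an assumption, not the
# paper's exact design).
import torch
import torch.nn as nn

class PrototypeModulator(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, visual_proto, text_emb):
        """visual_proto, text_emb: (C, D) per-class features; returns modulated (C, D)."""
        g = self.gate(torch.cat([visual_proto, text_emb], dim=-1))
        return g * visual_proto + (1.0 - g) * text_emb  # blend visual and semantic cues

mod = PrototypeModulator(dim=512)
protos = mod(torch.randn(5, 512), torch.randn(5, 512))
print(protos.shape)  # torch.Size([5, 512])
```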

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

wlin-at/maxi ICCV 2023

We adapt a VL model for zero-shot and few-shot action recognition using a collection of unlabeled videos and an unpaired action dictionary.