Action Recognition In Videos

64 papers with code • 17 benchmarks • 17 datasets

Action Recognition in Videos is a task in computer vision and pattern recognition where the goal is to identify and categorize human actions performed in a video sequence. The task involves analyzing the spatiotemporal dynamics of the actions and mapping them to a predefined set of action classes, such as running, jumping, or swimming.
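As a rough illustration of "mapping to a predefined set of action classes", the sketch below averages per-frame class scores over a clip and picks the highest-scoring class. This is only a simple baseline under stated assumptions: the function names, the three-class label set (taken from the examples above), and the per-frame scores are all hypothetical, and it is not the method of any model listed on this page. Averaging ignores frame order, so it does not capture the temporal dynamics the task description mentions.

```python
import math

# Hypothetical label set, taken from the examples in the task description.
ACTION_CLASSES = ["running", "jumping", "swimming"]


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_clip(frame_scores, classes=ACTION_CLASSES):
    """Average per-frame class scores over time, then pick the top class.

    frame_scores: one list of class scores per frame, e.g. the outputs of
    some per-frame classifier (assumed here, not provided).
    """
    if not frame_scores:
        raise ValueError("clip has no frames")
    num_frames = len(frame_scores)
    mean_scores = [
        sum(frame[c] for frame in frame_scores) / num_frames
        for c in range(len(classes))
    ]
    probs = softmax(mean_scores)
    best = max(range(len(classes)), key=probs.__getitem__)
    return classes[best], probs[best]


# Made-up scores for a 4-frame clip (illustrative values only).
clip = [
    [2.0, 0.5, 0.1],
    [1.5, 1.0, 0.2],
    [0.3, 2.5, 0.1],
    [2.2, 0.4, 0.3],
]
label, confidence = classify_clip(clip)
print(label, round(confidence, 3))
```

Here the third frame favors "jumping", but the clip-level average still selects "running"; models that reason over frame order would replace the averaging step.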

Benchmarks

These leaderboards are used to track progress in Action Recognition In Videos.

Dataset	Best Model
Jester (Gesture Recognition)	CPNet Res34, 5 CP
UCF101	STM (ImageNet+Kinetics pretrain)
Something-Something V2	CAST-B/16
Something-Something V1	STM (16 frames, ImageNet pretraining)
Kinetics-400	CAST-B/16
PKU-MMD	MMNet
Sports-1M	G-Blend
FS-Something-Something V2-Small	ITANet
FS-Something-Something V2-Full	ITANet
THUMOS’14	Single-stream R-C3D (two-way buffer)
AVA v2.2	YOWO+LFB*
HMDB-51	STM (ImageNet+Kinetics pretrain)
AVA v2.1	YOWO+LFB*
Kinetics-600	Florence
ActivityNet	LSTM + Pretrained on YT-8M
NTU RGB+D	2D-3D-Softargmax (RGB only)
miniSports	G-Blend

Libraries

Use these libraries to find Action Recognition In Videos models and implementations.

open-mmlab/mmaction2 • 4 papers • 3,866

yjxiong/caffe • 3 papers • 548

towhee-io/towhee • 2 papers • 2,968

MichiganCOG/M-PACT • 2 papers • 106

See all 5 libraries.

Datasets

Subtasks

Action Anticipation

Latest papers with no code

Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos

no code yet • 11 Apr 2024

These spatial features then undergo intermediate temporal modeling facilitated by the Mamba block before progressing to the encoder section, which comprises vanilla upsampling Shift S-GCN blocks.

Deep Learning Approaches for Human Action Recognition in Video Data

no code yet • 11 Mar 2024

The results of this study underscore the potential of composite models in achieving robust human action recognition and suggest avenues for future research in optimizing these models for real-world deployment.

DVANet: Disentangling View and Action Features for Multi-View Action Recognition

no code yet • 10 Dec 2023

In this work, we present a novel approach to multi-view action recognition where we guide learned action representations to be separated from view-relevant information in a video.

Action Class Relation Detection and Classification Across Multiple Video Datasets

no code yet • 15 Aug 2023

The Meta Video Dataset (MetaVD) provides annotated relations between action classes in major datasets for human action recognition in videos.

Knowledge Prompting for Few-shot Action Recognition

no code yet • 22 Nov 2022

Few-shot action recognition in videos is challenging for its lack of supervision and difficulty in generalizing to unseen actions.

Could Giant Pretrained Image Models Extract Universal Representations?

no code yet • 3 Nov 2022

In this paper, we present a study of frozen pretrained models when applied to diverse and representative computer vision tasks, including object detection, semantic segmentation and video action recognition.

Class-Incremental Learning for Action Recognition in Videos

no code yet • ICCV 2021

We tackle the catastrophic forgetting problem in the context of class-incremental learning for video recognition, which has not been explored actively despite the popularity of continual learning.

Co-training Transformer with Videos and Images Improves Action Recognition

no code yet • 14 Dec 2021

We term this approach as Co-training Videos and Images for Action Recognition (CoVeR).

Technical Report: Disentangled Action Parsing Networks for Accurate Part-level Action Parsing

no code yet • 5 Nov 2021

Despite dramatic progress in the area of video classification research, a severe problem faced by the community is that the detailed understanding of human actions is ignored.

Class incremental learning for video action classification

no code yet • IEEE International Conference on Image Processing (ICIP) 2021

Class Incremental Learning (CIL) is a hot topic in machine learning for CNN models to learn new classes incrementally.
