Action Classification

195 papers with code • 20 benchmarks • 26 datasets


Latest papers with no code

VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

no code yet • 29 Mar 2023

Finally, we successfully train a video ViT model with a billion parameters, which achieves a new state-of-the-art performance on the datasets of Kinetics (90.0% on K400 and 89.9% on K600) and Something-Something (68.7% on V1 and 77.0% on V2).

Multi-modal Prompting for Low-Shot Temporal Action Localization

no code yet • 21 Mar 2023

In this paper, we consider the problem of temporal action localization in low-shot (zero-shot and few-shot) scenarios, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, including categories not seen at training time.

Classification of Primitive Manufacturing Tasks from Filtered Event Data

no code yet • 15 Mar 2023

Several filters are compared and combined to remove event data noise.

Scaling Vision Transformers to 22 Billion Parameters

no code yet • 10 Feb 2023

The scaling of Transformers has driven breakthrough capabilities for language models.

Deep Dependency Networks for Multi-Label Classification

no code yet • 1 Feb 2023

We propose a simple approach which combines the strengths of probabilistic graphical models and deep learning architectures for solving the multi-label classification task, focusing specifically on image and video data.

Actor-Director-Critic: A Novel Deep Reinforcement Learning Framework

no code yet • 10 Jan 2023

For the two critic networks used, we design two target critic networks for each critic network instead of one.

HierVL: Learning Hierarchical Video-Language Embeddings

no code yet • 5 Jan 2023

Video-language embeddings are a promising avenue for injecting semantics into visual representations, but existing methods capture only short-term associations between seconds-long video clips and their accompanying text.

Hierarchical Explanations for Video Action Recognition

no code yet • 1 Jan 2023

We propose Hierarchical ProtoPNet: an interpretable network that explains its reasoning process by considering the hierarchical relationship between classes.

Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations

no code yet • 6 Dec 2022

In this paper, we introduce a new framework of contrastive action representation learning (CARL) to learn frame-wise action representation in a self-supervised or weakly-supervised manner, especially for long videos.

Spatio-Temporal Crop Aggregation for Video Representation Learning

no code yet • 30 Nov 2022

We propose Spatio-temporal Crop Aggregation for video representation LEarning (SCALE), a novel method that enjoys high scalability at both training and inference time.