Temporal Action Localization
421 papers with code • 14 benchmarks • 42 datasets
Temporal Action Localization aims to detect activities in a video stream and output their beginning and end timestamps. It is closely related to Temporal Action Proposal Generation.
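As a minimal illustration of the task's output format (not any listed paper's method), the sketch below turns per-frame confidence scores for one action class into (start, end) timestamp segments by thresholding and grouping consecutive frames; the function name, threshold, and fps are illustrative assumptions.

```python
import numpy as np

def localize_actions(frame_scores, threshold=0.5, fps=30.0):
    """Convert per-frame action confidences into (start_sec, end_sec)
    segments by thresholding and grouping consecutive active frames.
    Illustrative baseline only, not a published TAL method."""
    active = np.asarray(frame_scores) >= threshold
    segments = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i                                 # segment opens
        elif not is_active and start is not None:
            segments.append((start / fps, i / fps))   # segment closes
            start = None
    if start is not None:                             # segment runs to the end
        segments.append((start / fps, len(active) / fps))
    return segments

# Example: 10 frames at 1 fps, one activity spanning frames 3-6
scores = [0.1, 0.2, 0.1, 0.9, 0.8, 0.95, 0.7, 0.2, 0.1, 0.1]
print(localize_actions(scores, threshold=0.5, fps=1.0))  # [(3.0, 7.0)]
```

Real TAL models replace the fixed threshold with learned proposal scoring and boundary regression, but the output contract — a list of timestamped segments per action class — is the same.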
Libraries
Use these libraries to find Temporal Action Localization models and implementations
Datasets
Subtasks
Latest papers with no code
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
This paper introduces a novel approach to temporal action localization (TAL) in few-shot learning.
Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes
To this end, we first devise innovative strategies to adaptively select high-quality positive and negative classes from the label space, by modeling both the confidence and rank of a class in relation to those of the target class.
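The selection idea described above — ranking non-target classes by confidence relative to the target class — can be loosely sketched as follows. This is an assumption-laden illustration, not the paper's actual strategy: the function name, the top-k/bottom-k rule, and the example probabilities are all hypothetical.

```python
import numpy as np

def select_non_target_classes(probs, target, k_pos=2, k_neg=2):
    """Rank all classes by softmax confidence, drop the target class,
    then take the highest-ranked non-target classes as candidate
    positives and the lowest-ranked ones as reliable negatives.
    (A loose sketch of confidence/rank-based selection, not the
    paper's exact method.)"""
    order = np.argsort(probs)[::-1]                # classes, most confident first
    non_target = [int(c) for c in order if c != target]
    positives = non_target[:k_pos]                 # close to the target in rank
    negatives = non_target[-k_neg:]                # far from the target in rank
    return positives, negatives

# Example: 6 classes, target class 1 dominates the distribution
probs = np.array([0.05, 0.5, 0.2, 0.15, 0.07, 0.03])
pos, neg = select_non_target_classes(probs, target=1)
print(pos, neg)  # [2, 3] [0, 5]
```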
Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines
We propose a method that converts the hand action recognition problem into a hand skeletal trajectory classification problem, addressing the real-time performance requirements of industrial algorithms.
BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Training
Skeleton-based motion representations are robust for action localization and understanding due to their invariance to perspective, lighting, and occlusion compared with images.
Deep Learning Approaches for Human Action Recognition in Video Data
The results of this study underscore the potential of composite models in achieving robust human action recognition and suggest avenues for future research in optimizing these models for real-world deployment.
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Temporal localization of driving actions plays a crucial role in advanced driver-assistance systems and naturalistic driving studies.
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
The model uses 2D-pose features as the positional embedding of the transformer architecture and spatio-temporal features as the main input to the encoder of the transformer.
TikTokActions: A TikTok-Derived Video Dataset for Human Action Recognition
We find that the performance of the model pre-trained on our TikTok dataset is comparable to models trained on larger action recognition datasets (95.3% on UCF101 and 53.24% on HMDB51).
Active Generation Network of Human Skeleton for Action Recognition
To solve those problems, we propose a novel active generative network (AGN) that adaptively learns various action categories via motion style transfer, generating new actions when only a single sample or a few samples are available for a particular action.
Cutup and Detect: Human Fall Detection on Cutup Untrimmed Videos Using a Large Foundational Video Understanding Model
The results are promising for real-time application, and falls are detected at the video level with a state-of-the-art 0.96 F1 score on the HQFSD dataset under the given experimental settings.