Action Localization

135 papers with code • 0 benchmarks • 3 datasets

Action Localization is the task of finding the spatial and temporal coordinates of an action in a video. An action localization model identifies the frames at which an action starts and ends and returns the (x, y) coordinates of the action in each frame. These coordinates change as the object performing the action moves.
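
Concretely, a model's output for one detected action might be represented as follows. This is a minimal sketch; the class and field names are hypothetical, not a standard API:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ActionInstance:
    """One localized action (hypothetical schema, for illustration only)."""
    label: str                               # action class, e.g. "jump"
    start_frame: int                         # first frame of the action
    end_frame: int                           # last frame of the action
    boxes: List[Tuple[int, int, int, int]]   # per-frame (x1, y1, x2, y2) boxes

    def duration(self) -> int:
        return self.end_frame - self.start_frame + 1

# A jump over frames 10-12; the box shifts right as the actor moves.
det = ActionInstance("jump", 10, 12,
                     [(40, 60, 80, 120), (44, 60, 84, 120), (48, 60, 88, 120)])
```

Note that the detection carries one box per frame in its temporal extent, which is how spatial localization tracks a moving actor.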


Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization

RenHuan1999/CVPR2023_P-MIL CVPR 2023

Weakly-supervised temporal action localization aims to localize and recognize actions in untrimmed videos with only video-level category labels during training.

26
29 May 2023
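
Since only video-level labels are available in this setting, such methods typically score every segment per class and pool the segment scores into a video-level prediction that the label can supervise. A generic top-k multiple-instance pooling step might look like this (an illustrative sketch, not P-MIL's actual formulation):

```python
import numpy as np

def video_level_scores(segment_scores: np.ndarray, k: int = 3) -> np.ndarray:
    """Aggregate per-segment class scores of shape (T, C) into one
    video-level score per class by averaging each class's top-k segments.
    Generic top-k MIL pooling, for illustration only."""
    T, _ = segment_scores.shape
    k = min(k, T)
    topk = np.sort(segment_scores, axis=0)[-k:]  # (k, C) highest scores per class
    return topk.mean(axis=0)                     # (C,) video-level scores

# Toy example: 5 segments, 2 classes; only the video label supervises these.
scores = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.1, 0.1],
                   [0.7, 0.3],
                   [0.2, 0.9]])
video_pred = video_level_scores(scores, k=2)
```

The high-scoring segments that drive the video-level prediction then double as the temporal localization output.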

Boosting Weakly-Supervised Temporal Action Localization with Text Information

lgzlilili/boosting-wtal CVPR 2023

For the discriminative objective, we propose a Text-Segment Mining (TSM) mechanism, which constructs a text description based on the action class label, and regards the text as the query to mine all class-related segments.

33
01 May 2023
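
The query idea can be sketched as scoring segments by similarity to a text embedding of the class name and keeping the high-scoring ones. This is a simplified stand-in for TSM; the function name and threshold are assumptions:

```python
import numpy as np

def mine_segments(text_query: np.ndarray, segment_feats: np.ndarray,
                  thresh: float = 0.5):
    """Cosine similarity between one class-text embedding (D,) and
    segment features (T, D); return indices of segments above thresh."""
    q = text_query / np.linalg.norm(text_query)
    f = segment_feats / np.linalg.norm(segment_feats, axis=1, keepdims=True)
    sims = f @ q
    return np.where(sims > thresh)[0], sims

# Toy 2-D embeddings: segments 0 and 2 align with the text query.
query = np.array([1.0, 0.0])
feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]])
mined, sims = mine_segments(query, feats)
```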

Weakly-Supervised Temporal Action Localization with Bidirectional Semantic Consistency Constraint

lgzlilili/biscc 25 Apr 2023

The proposed Bi-SCC first adopts a temporal context augmentation to generate an augmented video that breaks the inter-video correlation between positive actions and their co-scene actions; then a semantic consistency constraint (SCC) enforces consistency between the predictions on the original and augmented videos, suppressing the co-scene actions.

7
25 Apr 2023
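
Such a constraint can be sketched as a simple consistency loss between the two sets of per-segment predictions. Mean-squared error is used here as a generic stand-in; Bi-SCC's actual objective may differ:

```python
import numpy as np

def consistency_loss(p_orig: np.ndarray, p_aug: np.ndarray) -> float:
    """Penalize disagreement between per-segment class probabilities
    predicted for the original and the augmented video (both (T, C))."""
    return float(np.mean((p_orig - p_aug) ** 2))

p_orig = np.array([[0.9, 0.1], [0.2, 0.8]])
p_aug  = np.array([[0.8, 0.2], [0.2, 0.8]])
loss = consistency_loss(p_orig, p_aug)
```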

Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels

zhou745/gaufuse_wstal CVPR 2023

Moreover, the generated pseudo-labels can fluctuate and be inaccurate in the early stages of training.

19
17 Apr 2023

WEAR: An Outdoor Sports Dataset for Wearable and Egocentric Activity Recognition

mariusbock/wear 11 Apr 2023

Though research has shown the complementarity of camera- and inertial-based data, datasets which offer both egocentric video and inertial-based sensor data remain scarce.

6
11 Apr 2023

TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization

tuantng/temporalmaxer 16 Mar 2023

To this end, we introduce TemporalMaxer, which minimizes long-term temporal context modeling while maximizing information from the extracted video clip features with a basic, parameter-free max-pooling block that operates on local regions.

47
16 Mar 2023
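
The core building block can be sketched as parameter-free max pooling over a local temporal window of clip features (illustrative only; the paper's block and padding choices may differ):

```python
import numpy as np

def local_max_pool(feats: np.ndarray, kernel: int = 3) -> np.ndarray:
    """Elementwise max over a sliding temporal window of size `kernel`
    (stride 1, same-length output). feats has shape (T, C)."""
    T, _ = feats.shape
    pad = kernel // 2
    padded = np.pad(feats, ((pad, pad), (0, 0)), constant_values=-np.inf)
    return np.stack([padded[t:t + kernel].max(axis=0) for t in range(T)])

# Four clip features with one channel; no learned parameters involved.
clip_feats = np.array([[1.0], [3.0], [2.0], [0.0]])
pooled = local_max_pool(clip_feats, kernel=3)
```

Because the block has no weights, all discriminative power must come from the pretrained clip features themselves, which is the paper's central argument.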

Faster Learning of Temporal Action Proposal via Sparse Multilevel Boundary Generator

zhouyang-001/smbg-for-temporal-action-proposal 6 Mar 2023

Temporal action localization in videos presents significant challenges in the field of computer vision.

0
06 Mar 2023

Chaotic World: A Large and Challenging Benchmark for Human Behavior Understanding in Chaotic Events

sutdcv/chaotic-world ICCV 2023

Understanding and analyzing human behaviors (actions and interactions of people), voices, and sounds in chaotic events is crucial in many applications, e.g., crowd management and emergency response services.

8
01 Jan 2023

Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Localization

kunnxia/npl ICCV 2023

To this end, we propose a unified framework, termed Noisy Pseudo-Label Learning, to handle both location biases and category errors.

4
01 Jan 2023

Boosting Positive Segments for Weakly-Supervised Audio-Visual Video Parsing

kranthikumarr/poibin ICCV 2023

To address this, we focus on improving the proportion of positive segments detected in a video.

1
01 Jan 2023