Weakly Supervised Action Localization
32 papers with code • 8 benchmarks • 5 datasets
In this task, the training data consists of videos labeled only with the list of activities they contain, without any temporal boundary annotations. At test time, however, given a video, the algorithm should recognize the activities in it and also provide the start and end time of each one.
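The gap between the training labels and the expected test output can be made concrete with a small sketch. Everything below is illustrative and not taken from any listed paper: the class names `WeakTrainingSample` and `Prediction`, the helper `scores_to_segments`, and the fixed threshold of 0.5 are all assumptions. The sketch assumes a model has already produced one score per fixed-length snippet for a given class, and it simply thresholds those scores to obtain start and end times, which is only one simple way to turn snippet-level scores into segments.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WeakTrainingSample:
    """Training-time annotation: video-level labels only, no timestamps."""
    video_id: str
    action_labels: List[str]


@dataclass
class Prediction:
    """Test-time output: an action label with its temporal extent."""
    label: str
    start_sec: float
    end_sec: float
    score: float


def scores_to_segments(
    scores: Sequence[float],
    label: str,
    snippet_sec: float,
    threshold: float = 0.5,  # assumed value, chosen only for illustration
) -> List[Prediction]:
    """Group consecutive snippets whose score reaches the threshold."""
    segments: List[Prediction] = []
    start = None
    # A sentinel below any threshold closes a run that reaches the last snippet.
    for i, s in enumerate(list(scores) + [float("-inf")]):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            run = scores[start:i]
            segments.append(
                Prediction(
                    label=label,
                    start_sec=start * snippet_sec,
                    end_sec=i * snippet_sec,
                    score=sum(run) / len(run),  # mean score of the run
                )
            )
            start = None
    return segments


# Hypothetical example: one training sample and one set of per-snippet scores.
sample = WeakTrainingSample(video_id="video_0001", action_labels=["diving"])
snippet_scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1]
for seg in scores_to_segments(snippet_scores, "diving", snippet_sec=2.0):
    print(seg.label, seg.start_sec, seg.end_sec)
# prints "diving 2.0 6.0" then "diving 8.0 10.0"
```

Note that `WeakTrainingSample` carries no timing information at all, while `Prediction` must; that asymmetry is what makes the task weakly supervised.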
Latest papers with no code
POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization
This paper tackles the challenge of point-supervised temporal action detection, wherein only a single frame is annotated for each action instance in the training set.
Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling
To address this problem, we propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics.
PivoTAL: Prior-Driven Supervision for Weakly-Supervised Temporal Action Localization
To address this, we present PivoTAL, Prior-driven Supervision for Weakly-supervised Temporal Action Localization, to approach WTAL from a localization-by-localization perspective by learning to localize the action snippets directly.
Two-Stream Networks for Weakly-Supervised Temporal Action Localization With Semantic-Aware Mechanisms
In this paper, we hypothesize that snippets with similar representations should be considered as the same action class despite the absence of supervision signals on each snippet.
Action Shuffling for Weakly Supervised Temporal Localization
Weakly supervised action localization is a challenging task with extensive applications, which aims to identify actions and the corresponding temporal intervals with only video-level annotations available.
Action Unit Memory Network for Weakly Supervised Temporal Action Localization
In this paper, we present an Action Unit Memory Network (AUMN) for weakly supervised temporal action localization, which can mitigate the above two challenges by learning an action unit memory bank.
Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization
In this paper, we introduce an Action-Context Separation Network (ACSNet) that explicitly takes into account context for accurate action localization.
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization
Temporally localizing actions in videos is one of the key components for video understanding.
Weakly-Supervised Action Localization and Action Recognition using Global-Local Attention of 3D CNN
The proposed approach intends to show the usefulness of every layer termed as global-local attention in 3D CNN via visual attribution, weakly-supervised action localization, and action recognition.