Weakly-supervised Temporal Action Localization
32 papers with code • 2 benchmarks • 2 datasets
Temporal action localization under weak supervision, where only video-level labels are available for training.
Latest papers with no code
STAT: Towards Generalizable Temporal Action Localization
To address this problem, we propose the Generalizable Temporal Action Localization task (GTAL), which focuses on improving the generalizability of action localization methods.
Sub-action Prototype Learning for Point-level Weakly-supervised Temporal Action Localization
Point-level weakly-supervised temporal action localization (PWTAL) aims to localize actions with only a single timestamp annotation for each action instance.
Cross-Video Contextual Knowledge Exploration and Exploitation for Ambiguity Reduction in Weakly Supervised Temporal Action Localization
Further, the GKSA module is used to efficiently summarize and propagate the cross-video representative action knowledge in a learnable manner to promote a holistic understanding of action patterns, which in turn allows the generation of high-confidence pseudo-labels for self-learning, thus alleviating ambiguity in temporal localization.
Video-Specific Query-Key Attention Modeling for Weakly-Supervised Temporal Action Localization
To better learn these action category queries, we exploit not only the features of the current input video but also the correlation between different videos through a novel video-specific action category query learner that works with a query similarity loss.
JCDNet: Joint of Common and Definite phases Network for Weakly Supervised Temporal Action Localization
These different actions are defined as conjoint actions, whose remaining parts are definite phases, e.g., leaping over the bar in a HighJump.
Unified Keypoint-based Action Recognition Framework via Structured Keypoint Pooling
A point cloud deep-learning paradigm is introduced to action recognition, and a unified framework along with a novel deep neural network architecture called Structured Keypoint Pooling is proposed.
Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature
Weakly-supervised temporal action localization aims to locate action regions and identify action categories in untrimmed videos simultaneously by taking only video-level labels as the supervision.
Two-Stream Networks for Weakly-Supervised Temporal Action Localization With Semantic-Aware Mechanisms
In this paper, we hypothesize that snippets with similar representations should be considered as the same action class despite the absence of supervision signals on each snippet.
Cascade Evidential Learning for Open-World Weakly-Supervised Temporal Action Localization
Aiming to recognize and localize action instances with only video-level labels during training, Weakly-supervised Temporal Action Localization (WTAL) has achieved significant progress in recent years.
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
As a result, the dual-branch complementarity is effectively fused to promote a strong alliance.