Weakly-supervised Temporal Action Localization
32 papers with code • 2 benchmarks • 2 datasets
Temporal action localization with weak supervision, where only video-level labels are available for training.
Libraries
Use these libraries to find Weakly-supervised Temporal Action Localization models and implementations
Latest papers
Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach
The approach comprises two core components: a snippet clustering module that groups snippets into multiple latent clusters, and a cluster classification module that further classifies each cluster as foreground or background.
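The clustering-then-classification idea can be sketched as follows. This is an illustrative toy only, not the paper's implementation: `cluster_snippets` (plain k-means over snippet features) and `classify_clusters` (a linear foreground/background scorer on cluster centers) are hypothetical names and shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

def cluster_snippets(feats, k=4, iters=10):
    """Toy k-means over snippet features (T, D); returns per-snippet cluster ids."""
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(iters):
        # assign each snippet to its nearest cluster center
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        ids = dists.argmin(1)
        # update each center to the mean of its assigned snippets
        for c in range(k):
            if (ids == c).any():
                centers[c] = feats[ids == c].mean(0)
    return ids, centers

def classify_clusters(centers, w, b=0.0):
    """Score each cluster center; sigmoid > 0.5 marks it as foreground."""
    logits = centers @ w + b
    return 1.0 / (1.0 + np.exp(-logits)) > 0.5

feats = rng.normal(size=(30, 8))   # 30 snippets, 8-dim features (made up)
ids, centers = cluster_snippets(feats)
is_fg = classify_clusters(centers, rng.normal(size=8))
```

In the paper the two components are trained jointly from video-level labels; here the classifier weights are random purely to show the data flow.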
DDG-Net: Discriminability-Driven Graph Network for Weakly-supervised Temporal Action Localization
Considering this phenomenon, we propose Discriminability-Driven Graph Network (DDG-Net), which explicitly models ambiguous snippets and discriminative snippets with well-designed connections, preventing the transmission of ambiguous information and enhancing the discriminability of snippet-level representations.
Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization
Weakly-supervised temporal action localization aims to localize and recognize actions in untrimmed videos with only video-level category labels during training.
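A common way to bridge snippet-level scores and video-level labels in this setting is multiple-instance-learning pooling. The sketch below shows one standard choice, top-k mean pooling; it is a generic illustration, not this paper's proposal-based method.

```python
import numpy as np

def video_score_topk(snippet_logits, k=3):
    """Aggregate per-snippet class logits (T, C) into one video-level score
    per class by averaging each class's top-k snippet logits (MIL pooling)."""
    topk = np.sort(snippet_logits, axis=0)[-k:]   # (k, C): top-k per class
    return topk.mean(axis=0)                      # (C,) video-level logits

T, C = 20, 5
logits = np.zeros((T, C))
logits[4:7, 2] = 5.0                 # a short action of class 2
video = video_score_topk(logits, k=3)
assert video.argmax() == 2           # class 2 dominates the video prediction
```

The video-level logits can then be trained with an ordinary classification loss against the video-level label, which is all the supervision available here.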
Boosting Weakly-Supervised Temporal Action Localization with Text Information
For the discriminative objective, we propose a Text-Segment Mining (TSM) mechanism, which constructs a text description based on the action class label, and regards the text as the query to mine all class-related segments.
Weakly-Supervised Temporal Action Localization with Bidirectional Semantic Consistency Constraint
The proposed Bi-SCC first applies a temporal context augmentation that generates an augmented video in which the correlation between positive actions and their co-scene actions is broken across videos; a semantic consistency constraint (SCC) then enforces consistent predictions between the original and augmented videos, thereby suppressing the co-scene actions.
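The consistency constraint can be sketched as a simple agreement loss between the class distributions predicted for the two views. This is a hypothetical form (mean squared error over softmax probabilities), not necessarily the exact loss used by Bi-SCC.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(logits_orig, logits_aug):
    """MSE between class probabilities of the original and augmented video,
    pushing the model to predict the same actions for both views."""
    p, q = softmax(logits_orig), softmax(logits_aug)
    return ((p - q) ** 2).mean()

x = np.array([[1.0, 2.0, 3.0]])
assert consistency_loss(x, x) == 0.0   # identical views incur no penalty
```

Predictions that drift apart between the two views (e.g., scores driven by scene context rather than the action itself) are penalized, which is the mechanism that suppresses co-scene actions.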
Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels
Moreover, the generated pseudo labels can fluctuate and be inaccurate in the early stage of training.
Consistency-based Self-supervised Learning for Temporal Anomaly Localization
This work tackles weakly supervised anomaly detection, in which a predictor learns not only from normal examples but also from a few labeled anomalies made available during training.
Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning
Accordingly, we first exclude the categories that are certainly absent from a video via a complementary learning loss.
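The complementary idea, learning from which classes are *not* present, can be sketched as a loss that drives snippet probabilities of absent classes toward zero. The `-log(1 - p)` form below is a hypothetical instantiation for illustration, not the paper's exact loss.

```python
import numpy as np

def complementary_loss(probs, video_labels, eps=1e-8):
    """For classes absent from the video-level label (label == 0), penalize
    any per-snippet probability mass via -log(1 - p), pushing p toward 0.

    probs: (T, C) per-snippet class probabilities in [0, 1)
    video_labels: (C,) binary video-level labels
    """
    absent = (video_labels == 0)        # (C,) mask of non-existent classes
    p_absent = probs[:, absent]         # (T, n_absent)
    return -np.log(1.0 - p_absent + eps).mean()

labels = np.array([1, 0])               # class 1 is absent from this video
low  = complementary_loss(np.full((4, 2), 0.01), labels)
high = complementary_loss(np.full((4, 2), 0.90), labels)
```

Since every class not in the video-level label is a guaranteed negative for all snippets, this supervision is noise-free, which is why it is applied first.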
Unleashing the Potential of Adjacent Snippets for Weakly-supervised Temporal Action Localization
C$^3$BN consists of two key ingredients: a micro data augmentation strategy that increases the diversity in-between adjacent snippets via their convex combination, and a macro-micro consistency regularization that enforces the model to be invariant to these transformations.
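The micro augmentation, convexly combining each pair of adjacent snippets, can be sketched as below. `mix_adjacent` is a hypothetical name; the mixing-weight distribution is an assumption (uniform here), not necessarily what C$^3$BN uses.

```python
import numpy as np

def mix_adjacent(snippets, rng):
    """Create in-between snippets by a convex combination of each adjacent
    pair of snippet features (T, D) -> (T-1, D), increasing temporal diversity."""
    lam = rng.uniform(size=(len(snippets) - 1, 1))      # per-pair mixing weight
    return lam * snippets[:-1] + (1 - lam) * snippets[1:]

rng = np.random.default_rng(0)
feats = np.arange(12, dtype=float).reshape(4, 3)   # 4 snippets, 3-dim features
mixed = mix_adjacent(feats, rng)                   # 3 interpolated snippets
```

Each synthesized snippet lies on the segment between its two neighbors in feature space, so the consistency regularizer can then require the model's outputs on these virtual snippets to interpolate accordingly.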
Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization
We target at the task of weakly-supervised action localization (WSAL), where only video-level action labels are available during model training.