Temporal Localization
55 papers with code • 0 benchmarks • 3 datasets
Latest papers with no code
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Temporal localization of driving actions plays a crucial role in advanced driver-assistance systems and naturalistic driving studies.
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
The model uses 2D-pose features as the positional embedding of the transformer architecture and spatio-temporal features as the main input to the encoder of the transformer.
OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog
OLViT addresses these challenges by maintaining a global dialog state based on the output of an Object State Tracker (OST) and a Language State Tracker (LST): while the OST attends to the most important objects within the video, the LST keeps track of the most important linguistic co-references to previous dialog turns.
Deep-Learning-Assisted Analysis of Cataract Surgery Videos
This thesis proposes a novel deep-learning-based framework for relevance-based compression to enable real-time streaming and adaptive storage of cataract surgery videos.
Survey of Action Recognition, Spotting and Spatio-Temporal Localization in Soccer -- Current Trends and Research Perspectives
Overall, this survey provides a valuable resource for researchers interested in the field of action scene understanding in soccer.
Cross-Video Contextual Knowledge Exploration and Exploitation for Ambiguity Reduction in Weakly Supervised Temporal Action Localization
Further, the GKSA module is used to efficiently summarize and propagate the cross-video representative action knowledge in a learnable manner to promote holistic action patterns understanding, which in turn allows the generation of high-confidence pseudo-labels for self-learning, thus alleviating ambiguity in temporal localization.
Single-Stage Visual Query Localization in Egocentric Videos
Our key idea is to first build a holistic understanding of the query-video relationship and then perform spatio-temporal localization in a single-shot manner.
Autonomous Stabilization of Retinal Videos for Streamlining Assessment of Spontaneous Venous Pulsations
Both of the evaluations support its effectiveness in facilitating the observation of SVPs.
Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding
Existing video-language pre-training methods primarily focus on instance-level alignment between video clips and captions via global contrastive learning but neglect rich fine-grained local information in both videos and text, which is of importance to downstream tasks requiring temporal localization and semantic reasoning.
VADER: Video Alignment Differencing and Retrieval
We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos.