Temporal Localization

55 papers with code • 0 benchmarks • 3 datasets

Latest papers with no code

Density-Guided Label Smoothing for Temporal Localization of Driving Actions

no code yet • 11 Mar 2024

Temporal localization of driving actions plays a crucial role in advanced driver-assistance systems and naturalistic driving studies.

Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition

no code yet • 11 Mar 2024

The model uses 2D-pose features as the positional embedding of the transformer architecture and spatio-temporal features as the main input to the encoder of the transformer.

OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog

no code yet • 20 Feb 2024

OLViT addresses these challenges by maintaining a global dialog state based on the output of an Object State Tracker (OST) and a Language State Tracker (LST): while the OST attends to the most important objects within the video, the LST keeps track of the most important linguistic co-references to previous dialog turns.

Deep-Learning-Assisted Analysis of Cataract Surgery Videos

no code yet • 10 Dec 2023

This thesis proposes a novel deep-learning-based framework for relevance-based compression to enable real-time streaming and adaptive storage of cataract surgery videos.

Survey of Action Recognition, Spotting and Spatio-Temporal Localization in Soccer -- Current Trends and Research Perspectives

no code yet • 21 Sep 2023

Overall, this survey provides a valuable resource for researchers interested in the field of action scene understanding in soccer.

Cross-Video Contextual Knowledge Exploration and Exploitation for Ambiguity Reduction in Weakly Supervised Temporal Action Localization

no code yet • 24 Aug 2023

Further, the GKSA module is used to efficiently summarize and propagate the cross-video representative action knowledge in a learnable manner to promote holistic action patterns understanding, which in turn allows the generation of high-confidence pseudo-labels for self-learning, thus alleviating ambiguity in temporal localization.

Single-Stage Visual Query Localization in Egocentric Videos

no code yet • NeurIPS 2023

Our key idea is to first build a holistic understanding of the query-video relationship and then perform spatio-temporal localization in a single shot manner.

Autonomous Stabilization of Retinal Videos for Streamlining Assessment of Spontaneous Venous Pulsations

no code yet • 10 May 2023

Both of the evaluations support its effectiveness in facilitating the observation of SVPs.

Structured Video-Language Modeling with Temporal Grouping and Spatial Grounding

no code yet • 28 Mar 2023

Existing video-language pre-training methods primarily focus on instance-level alignment between video clips and captions via global contrastive learning but neglect rich fine-grained local information in both videos and text, which is of importance to downstream tasks requiring temporal localization and semantic reasoning.

VADER: Video Alignment Differencing and Retrieval

no code yet • ICCV 2023

We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos.