Zero-Shot Action Detection

3 papers with code • 2 benchmarks • 2 datasets

Zero-shot action detection aims to localize and classify action instances in (typically untrimmed) videos for action categories that were not seen during training, most commonly by transferring knowledge from image-text pretrained vision-language models such as CLIP.

Most implemented papers

Prompting Visual-Language Models for Efficient Video Understanding

ju-chen/Efficient-Prompt 8 Dec 2021

Image-based visual-language (I-VL) pre-training has shown great success for learning joint visual-textual representations from large-scale web data, revealing remarkable ability for zero-shot generalisation.
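
As a rough illustration of this idea, the sketch below classifies a video clip in a zero-shot manner by matching mean-pooled frame features from a frozen CLIP model against text prompts built from action names. The prompt template, action vocabulary, random placeholder frames, and mean-pooling are assumptions for illustration, not the paper's exact recipe.

```python
# Minimal sketch (not the paper's implementation) of zero-shot action
# classification with a frozen image-text model; assumes the `clip`
# package (https://github.com/openai/CLIP) and PyTorch are installed.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical action vocabulary; unseen class names can be plugged in freely.
actions = ["high jump", "playing guitar", "brushing teeth"]
prompts = clip.tokenize([f"a video of a person {a}" for a in actions]).to(device)

# `frames` stands in for preprocessed video frames of shape (T, 3, 224, 224).
frames = torch.randn(8, 3, 224, 224, device=device)

with torch.no_grad():
    text_feat = model.encode_text(prompts)              # (C, D)
    frame_feat = model.encode_image(frames)             # (T, D)
    video_feat = frame_feat.mean(dim=0, keepdim=True)   # temporal pooling -> (1, D)

    video_feat = video_feat / video_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    scores = (100.0 * video_feat @ text_feat.T).softmax(dim=-1)  # (1, C)

print("predicted action:", actions[scores.argmax().item()])
```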

Zero-Shot Temporal Action Detection via Vision-Language Prompting

sauradip/stale 17 Jul 2022

This decoupled design eliminates the dependence between localization and classification by breaking the error-propagation path between the two stages.
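
The snippet below is a minimal, self-contained sketch of that decoupled "localize then classify" idea, not the STALE implementation: a class-agnostic actionness score proposes segments, and each segment is then labelled independently against text embeddings of class names. All features and the actionness head are random placeholders.

```python
# Sketch of class-agnostic localization followed by zero-shot classification.
import torch

T, D, C = 100, 512, 3                      # snippets, feature dim, classes
snippet_feat = torch.randn(T, D)           # placeholder per-snippet video features
text_embed = torch.randn(C, D)             # placeholder text embeddings of class prompts

# 1) Class-agnostic localization: binary actionness per snippet
#    (a random linear head stands in for a trained localizer).
actionness = torch.sigmoid(snippet_feat @ torch.randn(D, 1)).squeeze(-1)  # (T,)
mask = actionness > 0.5

# Group consecutive foreground snippets into (start, end) segments.
segments, start = [], None
for t in range(T):
    if mask[t] and start is None:
        start = t
    elif not mask[t] and start is not None:
        segments.append((start, t))
        start = None
if start is not None:
    segments.append((start, T))

# 2) Classify each segment independently of the localizer, so classification
#    errors cannot propagate back into the localization stage.
text_norm = text_embed / text_embed.norm(dim=-1, keepdim=True)
for s, e in segments:
    seg_feat = snippet_feat[s:e].mean(dim=0, keepdim=True)
    seg_feat = seg_feat / seg_feat.norm(dim=-1, keepdim=True)
    cls = (seg_feat @ text_norm.T).argmax().item()
    print(f"segment [{s}, {e}) -> class {cls}")
```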

UnLoc: A Unified Framework for Video Localization Tasks

google-research/scenic ICCV 2023

While large-scale image-text pretrained models such as CLIP have been used for multiple video-level tasks on trimmed videos, their use for temporal localization in untrimmed videos is still a relatively unexplored task.
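
As a toy illustration of applying frozen image-text features to temporal localization in untrimmed video, the sketch below scores every frame against a single text query and extracts the maximum-scoring contiguous window. This is a generic baseline under assumed placeholder features, not the UnLoc architecture.

```python
# Sketch: query-conditioned temporal localization from per-frame similarities.
import torch

T, D = 200, 512
frame_feat = torch.randn(T, D)             # placeholder per-frame image features
query_feat = torch.randn(1, D)             # placeholder text feature of the query

frame_feat = frame_feat / frame_feat.norm(dim=-1, keepdim=True)
query_feat = query_feat / query_feat.norm(dim=-1, keepdim=True)
relevance = (frame_feat @ query_feat.T).squeeze(-1)   # (T,) cosine similarity

# Subtract the median (background level) and find the maximum-sum contiguous
# window (Kadane's algorithm): the span most consistently relevant to the query.
centered = relevance - relevance.median()
best, cur, start, best_span = float("-inf"), 0.0, 0, (0, 0)
for t in range(T):
    if cur <= 0:
        cur, start = 0.0, t
    cur += centered[t].item()
    if cur > best:
        best, best_span = cur, (start, t + 1)

print(f"predicted moment: frames [{best_span[0]}, {best_span[1]})")
```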