Zero-Shot Action Recognition on UCF101
3 papers with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in Zero-Shot Action Recognition on UCF101.
Most implemented papers
Learning Spatiotemporal Features via Video and Text Pair Discrimination
Our CPD model yields a new state of the art for zero-shot action recognition on UCF101 by directly utilizing the learned visual-textual embeddings.
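Approaches like CPD frame zero-shot action recognition as matching a learned video embedding against text embeddings of the (unseen) class names. A minimal sketch of that matching step, using random embeddings and illustrative shapes rather than the paper's actual model:

```python
import numpy as np

def zero_shot_classify(video_emb, class_name_embs):
    """Predict an action class by matching a video embedding against
    text embeddings of class names.

    Hypothetical shapes: video_emb is (d,), class_name_embs is
    (num_classes, d) -- one text embedding per unseen class name.
    """
    # L2-normalize both sides so the dot product is cosine similarity.
    v = video_emb / np.linalg.norm(video_emb)
    t = class_name_embs / np.linalg.norm(class_name_embs, axis=1, keepdims=True)
    scores = t @ v  # cosine similarity to each class-name embedding
    return int(np.argmax(scores)), scores

# Toy example: random stand-in embeddings for 3 unseen classes.
rng = np.random.default_rng(0)
video_emb = rng.normal(size=128)
class_embs = rng.normal(size=(3, 128))
pred, scores = zero_shot_classify(video_emb, class_embs)
print(pred, scores.shape)
```

Because no classifier weights are tied to specific classes, new action categories can be added at test time simply by embedding their names.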
Orthogonal Temporal Interpolation for Zero-Shot Video Recognition
We propose OTI, a model for zero-shot video recognition (ZSVR) that employs orthogonal temporal interpolation and a matching loss based on vision-language models (VLMs).
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Because training vision-language models on large-scale video data is resource-intensive, most studies have focused on adapting pre-trained image-language models to the video domain.