Video Recognition

147 papers with code • 0 benchmarks • 10 datasets

Video Recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

VG4D: Vision-Language Model Goes 4D Video Recognition

shark0-0/vg4d 17 Apr 2024

By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.

6 stars

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

nus-hpc-ai-lab/dynamic-tuning 18 Mar 2024

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success in adapting vision transformers (ViTs) by improving parameter efficiency.

22 stars

Don't Judge by the Look: Towards Motion Coherent Video Representation

bespontaneous/mca-pytorch 14 Mar 2024

Current training pipelines in object recognition neglect hue jittering during data augmentation: it not only brings appearance changes that are detrimental to classification, but its implementation is also inefficient in practice.

4 stars
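
For readers unfamiliar with the hue jittering mentioned in the entry above, the following is a minimal sketch of the general augmentation applied to a video clip. It is not the paper's method or code. The function names (`shift_hue`, `hue_jitter_clip`), the `max_delta` parameter, and the nested-list clip layout are all assumptions made for illustration. It uses only the Python standard library, which is far slower than a tensor-based implementation would be.

```python
import colorsys
import random


def shift_hue(pixel, delta):
    """Rotate the hue of one RGB pixel (floats in [0, 1]) by `delta` turns."""
    r, g, b = pixel
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Hue is cyclic, so wrap it back into [0, 1).
    return colorsys.hsv_to_rgb((h + delta) % 1.0, s, v)


def hue_jitter_clip(clip, max_delta=0.1, rng=None):
    """Apply one random hue shift to every frame of a clip.

    `clip` is a list of frames; each frame is a list of rows of RGB tuples
    (a hypothetical layout chosen for this sketch). The shift is sampled
    once per clip, so all frames receive the same colour change.
    """
    if rng is None:
        rng = random.Random()
    delta = rng.uniform(-max_delta, max_delta)
    return [
        [[shift_hue(pixel, delta) for pixel in row] for row in frame]
        for frame in clip
    ]
```

Sampling the shift once per clip, rather than once per frame, is a design choice of this sketch: it keeps colours consistent over time so the augmentation changes appearance without introducing frame-to-frame flicker.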

HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition

dun-research/haltingvt 10 Jan 2024

Action recognition in videos poses a challenge due to its high computational cost, especially for Joint Space-Time video transformers (Joint VT).

4 stars

Video Recognition in Portrait Mode

bytedance/Portrait-Mode-Video 21 Dec 2023

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

25 stars

Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition

event-ahu/tscformer 18 Dec 2023

It is intuitive to combine them for high-performance RGB-Event based video recognition; however, existing works fail to achieve a good balance between accuracy and the number of model parameters.

5 stars

LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer

ziyuzhao-zzy/logostylefool 15 Dec 2023

We separate the attack into three stages: style reference selection, reinforcement-learning-based logo style transfer, and perturbation optimization.

1 star

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

tomchen-ctj/OST 30 Nov 2023

Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.

28 stars

Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition

ftkr12/rostfine 10 Nov 2023

Infertility is a global health problem, at least half of whose cases are caused by men, and an increasing number of couples are seeking medical assistance to achieve reproduction.

0 stars

Object-centric Video Representation for Long-term Action Anticipation

brown-palm/ObjectPrompt 31 Oct 2023

To recognize and predict human-object interactions, we use a Transformer-based neural architecture which allows the "retrieval" of relevant objects for action anticipation at various time scales.

3 stars