Video Recognition

147 papers with code • 0 benchmarks • 10 datasets

Video Recognition is the process of obtaining, processing, and analysing data from a visual source, specifically video.

Benchmarks

These leaderboards are used to track progress in Video Recognition.

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Libraries

Use these libraries to find Video Recognition models and implementations.

open-mmlab/mmaction2 • 5 papers • 3,904 stars

open-mmlab/mmtracking • 3 papers • 3,384 stars

facebookresearch/pytorchvideo • 3 papers • 3,186 stars

towhee-io/towhee • 3 papers • 2,998 stars

See all 9 libraries.

Datasets

Latest papers

VG4D: Vision-Language Model Goes 4D Video Recognition

shark0-0/vg4d • 17 Apr 2024

By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

nus-hpc-ai-lab/dynamic-tuning • 18 Mar 2024

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success in vision transformer (ViT) adaptation by improving parameter efficiency.

Don't Judge by the Look: Towards Motion Coherent Video Representation

bespontaneous/mca-pytorch • 14 Mar 2024

Current training pipelines in object recognition neglect Hue Jittering during data augmentation, both because it brings appearance changes that are detrimental to classification and because its implementation is inefficient in practice.

HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition

dun-research/haltingvt • 10 Jan 2024

Action recognition in videos poses a challenge due to its high computational cost, especially for Joint Space-Time video transformers (Joint VT).

Video Recognition in Portrait Mode

bytedance/Portrait-Mode-Video • 21 Dec 2023

While existing datasets mainly comprise landscape mode videos, our paper seeks to introduce portrait mode videos to the research community and highlight the unique challenges associated with this video format.

Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition

event-ahu/tscformer • 18 Dec 2023

It is intuitive to combine CNNs and Transformers for high-performance RGB-Event based video recognition; however, existing works fail to achieve a good balance between accuracy and model parameters.

LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer

ziyuzhao-zzy/logostylefool • 15 Dec 2023

We separate the attack into three stages: style reference selection, reinforcement-learning-based logo style transfer, and perturbation optimization.

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

tomchen-ctj/OST • 30 Nov 2023

Due to the resource-intensive nature of training vision-language models on expansive video data, a majority of studies have centered on adapting pre-trained image-language models to the video domain.

Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition

ftkr12/rostfine • 10 Nov 2023

Infertility is a global health problem, and an increasing number of couples are seeking medical assistance to achieve reproduction; at least half of infertility cases are caused by men.

Object-centric Video Representation for Long-term Action Anticipation

brown-palm/ObjectPrompt • 31 Oct 2023

To recognize and predict human-object interactions, we use a Transformer-based neural architecture which allows the "retrieval" of relevant objects for action anticipation at various time scales.
