Action Detection

233 papers with code • 11 benchmarks • 33 datasets

Action Detection aims to find both where and when an action occurs within a video clip and to classify which action is taking place. Results are typically given as action tubelets: action bounding boxes linked across time in the video. This is related to temporal localization, which seeks to identify the start and end frames of an action, and to action recognition, which seeks only to classify which action is taking place and typically assumes a trimmed video.
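
To make the notion of a tubelet concrete, the sketch below represents one as a mapping from frame index to bounding box and scores a predicted tubelet against a ground-truth one. It is only an illustration, not the protocol of any benchmark listed on this page: the names `ActionTube`, `box_iou`, `temporal_iou`, and `tube_iou` are hypothetical, and the overlap rule (temporal IoU multiplied by the mean box IoU over shared frames) is one common convention that individual benchmarks may define differently. Tubelets are assumed to cover a contiguous range of frames.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class ActionTube:
    """Hypothetical container: one action label plus one box per frame."""
    label: str
    boxes: Dict[int, Box]  # frame index -> bounding box

    @property
    def start(self) -> int:
        return min(self.boxes)

    @property
    def end(self) -> int:
        return max(self.boxes)


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def temporal_iou(a: ActionTube, b: ActionTube) -> float:
    """IoU of the two inclusive frame ranges [start, end]."""
    inter = min(a.end, b.end) - max(a.start, b.start) + 1
    if inter <= 0:
        return 0.0
    union = (a.end - a.start + 1) + (b.end - b.start + 1) - inter
    return inter / union


def tube_iou(a: ActionTube, b: ActionTube) -> float:
    """Assumed convention: temporal IoU times mean box IoU on shared frames."""
    shared = a.boxes.keys() & b.boxes.keys()
    if not shared:
        return 0.0
    spatial = sum(box_iou(a.boxes[f], b.boxes[f]) for f in shared) / len(shared)
    return temporal_iou(a, b) * spatial


# Toy example: identical boxes, but the prediction starts five frames early.
pred = ActionTube("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(0, 10)})
gt = ActionTube("run", {f: (0.0, 0.0, 10.0, 10.0) for f in range(5, 15)})
print(pred.label == gt.label, tube_iou(pred, gt))
```

The three tasks in the paragraph above map onto this sketch as follows: action recognition would compare only `label`, temporal localization only `temporal_iou`, and action detection all three components at once.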

Benchmarks

These leaderboards are used to track progress in Action Detection.

Dataset	Best Model
J-HMDB	HIT
Charades	TTM
UCF101-24	STAR/L
Multi-THUMOS	MLAD
UCF Sports	T-CNN
THUMOS'14	MAT (Ours) Trans
TSU	PDAN
TTStroke-21 ME22	STCNN-V2 (Vote decision)
TTStroke-21 ME21	STCNN
MultiSports	HIT
MultiTHUMOS	PAT

Libraries

Use these libraries to find Action Detection models and implementations.

open-mmlab/mmaction2 • 6 papers • 3,892

alibaba-damo-academy/FunASR • 3 papers • 3,299

Frostinassiky/gtad • 3 papers • 216

towhee-io/towhee • 2 papers • 2,991

Datasets

Subtasks

Audio-Visual Active Speaker Detection

Fine-Grained Action Detection

Action Triplet Detection

Few Shot Temporal Action Localization

Multiple Action Detection

Latest papers with no code

A Customer Level Fraudulent Activity Detection Benchmark for Enhancing Machine Learning Model Research and Evaluation

no code yet • 23 Apr 2024

In the field of fraud detection, the availability of comprehensive and privacy-compliant datasets is crucial for advancing machine learning research and developing effective anti-fraud systems.

Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection

no code yet • 17 Apr 2024

The integration of Light Detection and Ranging (LiDAR) and Internet of Things (IoT) technologies offers transformative opportunities for public health informatics in urban safety and pedestrian well-being.

STMixer: A One-Stage Sparse Action Detector

no code yet • 15 Apr 2024

First, we present a query-based adaptive feature sampling module, which endows the detector with the flexibility of mining a group of discriminative features from the entire spatio-temporal domain.

TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

no code yet • 3 Apr 2024

In this paper, we investigate that the normalized coordinate expression is a key factor as reliance on hand-crafted components in query-based detectors for temporal action detection (TAD).

Action Detection via an Image Diffusion Process

no code yet • 1 Apr 2024

Action detection aims to localize the starting and ending points of action instances in untrimmed videos, and predict the classes of those instances.

Dual DETRs for Multi-Label Temporal Action Detection

no code yet • 31 Mar 2024

To address this issue, we propose a new Dual-level query-based TAD framework, namely DualDETR, to detect actions from both instance-level and boundary-level.

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

no code yet • 29 Mar 2024

In this paper, we extensively analyze the robustness of seven leading TAD methods and obtain some interesting findings: 1) Existing methods are particularly vulnerable to temporal corruptions, and end-to-end methods are often more susceptible than those with a pre-trained feature extractor; 2) Vulnerability mainly comes from localization error rather than classification error; 3) When corruptions occur in the middle of an action instance, TAD models tend to yield the largest performance drop.

Deep Learning-Assisted Parallel Interference Cancellation for Grant-Free NOMA in Machine-Type Communication

no code yet • 12 Mar 2024

The third framework is designed to accommodate the non-coherent scheme involving a small number of data bits, which simultaneously performs AD and DD.

Detection of Object Throwing Behavior in Surveillance Videos

no code yet • 11 Mar 2024

Second, we compare the performance of different feature extractors for our anomaly detection method on the UCF-Crime and Throwing-Action datasets.

Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications

no code yet • 11 Mar 2024

Past studies on end-to-end meeting transcription have focused on model architecture and have mostly been evaluated on simulated meeting data.
