Activity Detection
63 papers with code • 1 benchmark • 12 datasets
Detecting activities in extended videos.
Most implemented papers
VoxLingua107: a Dataset for Spoken Language Recognition
Speech activity detection and speaker diarization are used to extract segments from the videos that contain speech.
ROAD: The ROad event Awareness Dataset for Autonomous Driving
We also report the performance on the ROAD tasks of Slowfast and YOLOv5 detectors, as well as that of the winners of the ICCV2021 ROAD challenge, which highlight the challenges faced by situation awareness in autonomous driving.
End-to-end speaker segmentation for overlap-aware resegmentation
Experiments on multiple speaker diarization datasets conclude that our model can be used with great success on both voice activity detection and overlapped speech detection.
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications
We propose a system that combines speech activity detection (SAD) and a BERT model to perform speaker change detection and speaker role detection (SRD) by chunking ASR transcripts, i.e., speaker diarization (SD) with a defined number of speakers together with SRD.
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information
In this paper, we reformulate this task as a single-label prediction problem by encoding the multi-speaker labels with power set.
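As a rough illustration of the power-set idea, a frame's multi-speaker activity vector can be mapped to a single class index, so that a multi-label problem over N speakers becomes a single-label problem over 2^N classes. The sketch below is a minimal example under stated assumptions: the bit-mask ordering and the function names `powerset_encode` and `powerset_decode` are hypothetical and are not taken from the paper.

```python
from typing import List


def powerset_encode(active: List[int]) -> int:
    """Map a binary speaker-activity vector to one class index.

    Assumption: speaker i contributes bit i of the label, so N speakers
    yield 2**N possible classes (the power set of the speaker set).
    """
    label = 0
    for speaker, is_active in enumerate(active):
        if is_active:
            label |= 1 << speaker
    return label


def powerset_decode(label: int, num_speakers: int) -> List[int]:
    """Recover the binary speaker-activity vector from a class index."""
    return [(label >> speaker) & 1 for speaker in range(num_speakers)]


# Example: speakers 0 and 2 overlap in one frame of a 3-speaker recording.
frame = [1, 0, 1]
encoded = powerset_encode(frame)
decoded = powerset_decode(encoded, num_speakers=3)
```

With this encoding, overlapped speech is simply another class rather than several simultaneous positive labels, at the cost of a class count that grows exponentially with the number of speakers.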
Unstructured Human Activity Detection from RGBD Images
Being able to detect and recognize human activities is essential for several applications, including personal assistive robotics.
ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding
In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize.
Untrimmed Video Classification for Activity Detection: submission to ActivityNet Challenge
We propose a simple, yet effective, method for the temporal detection of activities in temporally untrimmed videos with the help of untrimmed classification.
A Pursuit of Temporal Accuracy in General Activity Detection
Detecting activities in untrimmed videos is an important but challenging task.
Protest Activity Detection and Perceived Violence Estimation from Social Media Images
We also release the UCLA Protest Image Dataset, our novel dataset of 40,764 images (11,659 protest images and hard negatives) with various annotations of visual attributes and sentiments.