Activity Detection
63 papers with code • 1 benchmark • 12 datasets
Detecting activities in extended videos.
Latest papers with no code
Activity Detection for Massive Connectivity in Cell-free Networks with Unknown Large-scale Fading, Channel Statistics, Noise Variance, and Activity Probability: A Bayesian Approach
This problem is even more severe in cell-free networks as there are many of these parameters to be acquired.
Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization
The proposed method can take audio-visual input and leverage the speaker's acoustic footprint or lip track to flexibly conduct audio-based, video-based, and audio-visual speaker diarization in a unified sequence-to-sequence framework.
Single-Microphone Speaker Separation and Voice Activity Detection in Noisy and Reverberant Environments
Speech separation involves extracting an individual speaker's voice from a multi-speaker audio signal.
Self-supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions
Our experiments show that self-supervised pretraining not only improves performance in clean conditions, but also yields models which are more robust to adverse conditions compared to purely supervised learning.
Spatiotemporal Event Graphs for Dynamic Scene Understanding
In this thesis, we present a series of frameworks for dynamic scene understanding starting from road event detection from an autonomous driving perspective to complex video activity detection, followed by continual learning approaches for the life-long learning of the models.
Towards More Practical Group Activity Detection: A New Benchmark and Model
Group activity detection (GAD) is the task of identifying members of each group and classifying the activity of the group at the same time in a video.
SPIRE-SIES: A Spontaneous Indian English Speech Corpus
Transcripts for 23 hours of speech are generated and validated, and can serve as a spontaneous speech ASR benchmark.
Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements
This project tackles the pressing issue of human trafficking in online C2C marketplaces through advanced Natural Language Processing (NLP) techniques.
A Hybrid Graph Network for Complex Activity Detection in Video
Attention is then applied to this graph to obtain an overall representation of the local dynamic scene.
Prompt-driven Target Speech Diarization
We introduce a novel task named 'target speech diarization', which seeks to determine 'when target event occurred' within an audio signal.