51 papers with code • 0 benchmarks • 12 datasets
Detecting activities in extended videos.
This thesis explores different approaches using Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos, and proposes an implementation that achieves this.
We propose a single neural network architecture for two tasks: on-line keyword spotting and voice activity detection.
Finally, an a posteriori SNR-weighted energy difference is applied to the extended pitch segments of the denoised speech signal to detect voice activity.
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once.
In this paper, we introduce the concept of learning latent super-events from activity videos, and present how it benefits activity detection in continuous videos.
We also report the performance on the ROAD tasks of SlowFast and YOLOv5 detectors, as well as that of the winners of the ICCV 2021 ROAD challenge, which highlight the challenges faced by situation awareness in autonomous driving.