61 papers with code • 1 benchmark • 12 datasets
Detecting activities in extended videos.
This thesis explores different approaches using Convolutional and Recurrent Neural Networks to classify and temporally localize activities in videos, and proposes an implementation that achieves this.
We propose a single neural network architecture for two tasks: on-line keyword spotting and voice activity detection.
Finally, an a posteriori SNR-weighted energy difference is applied to the extended pitch segments of the denoised speech signal to detect voice activity.
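As a rough illustration of the idea in the sentence above, the sketch below scores each frame by its energy change from the previous frame, weighted by that frame's a posteriori SNR. The exact formula is not given here, so the form used, sqrt(|e[m] - e[m-1]| * max(SNR_post[m], 0)) with SNR_post in dB, is an assumption, as are the helper names, the frame length, and the noise estimate taken from the first few frames. Pitch-segment extension and denoising are omitted.

```python
import math
import random


def frame_energies(signal, frame_len):
    """Split the signal into non-overlapping frames and return each frame's energy."""
    n_frames = len(signal) // frame_len
    return [
        sum(x * x for x in signal[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]


def snr_weighted_energy_difference(energies, noise_frames=5, eps=1e-12):
    """Return one score per frame (the first frame gets 0.0).

    Assumed form (not taken from the source):
        score[m] = sqrt(|e[m] - e[m-1]| * max(snr_post_db[m], 0))
    where snr_post_db[m] = 10 * log10(e[m] / noise_energy) and the noise
    energy is the mean energy of the first `noise_frames` frames.
    """
    noise_energy = sum(energies[:noise_frames]) / noise_frames + eps
    scores = [0.0]
    for m in range(1, len(energies)):
        snr_post_db = 10.0 * math.log10((energies[m] + eps) / noise_energy)
        diff = abs(energies[m] - energies[m - 1])
        scores.append(math.sqrt(diff * max(snr_post_db, 0.0)))
    return scores


def demo():
    """Synthetic test signal: 1 s of low-level noise with a tone from 0.4 s to 0.6 s."""
    rng = random.Random(0)
    rate, frame_len = 8000, 200
    signal = [rng.gauss(0.0, 0.01) for _ in range(rate)]
    for n in range(3200, 4800):
        signal[n] += 0.5 * math.sin(2 * math.pi * 200 * n / rate)
    energies = frame_energies(signal, frame_len)
    scores = snr_weighted_energy_difference(energies)
    return energies, scores


if __name__ == "__main__":
    _, scores = demo()
    onset = scores.index(max(scores))
    print("highest-scoring frame:", onset)
```

In this form the score peaks where energy rises sharply in a high-SNR frame, so it marks onsets rather than the full active region; turning the scores into voice/non-voice decisions (for example by thresholding within candidate segments) is a further step not shown here.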
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization
End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once.
In this paper, we introduce the concept of learning latent super-events from activity videos, and present how it benefits activity detection in continuous videos.
In presence detection, the way training data containing human presence is collected can have a significant impact on performance.