Activity Detection

63 papers with code • 1 benchmark • 12 datasets

Detecting activities in extended videos.

Benchmarks

These leaderboards are used to track progress in Activity Detection.

Dataset	Best Model
AVA-Speech	CNN-BiLSTM_best

Libraries

Use these libraries to find Activity Detection models and implementations.

alibaba-damo-academy/FunASR

3 papers

3,096

Datasets

Latest papers

Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0

mkunes/w2v2_audioframeclassification • 26 Oct 2022

In this paper, we explore the effectiveness of this model on three basic speech classification tasks: speaker change detection, overlapped speech detection, and voice activity detection.

Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation

marianne-m/brouhaha-vad • 24 Oct 2022

Most automatic speech processing systems register degraded performance when applied to noisy or reverberant speech.

108

MM-ALT: A Multimodal Automatic Lyric Transcription System

guxm2021/MM_ALT • 13 Jul 2022

Automatic lyric transcription (ALT) is a nascent field of study attracting increasing interest from both the speech and music information retrieval communities, given its significant application potential.

A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels

marthadais/aisclassification • 12 Jul 2022

To this end, we leverage the unsupervised nature of cluster analysis to label the trajectory geometry highlighting the changes in the vessel's moving pattern which tends to indicate fishing activity.

Adversarial Multi-Task Deep Learning for Noise-Robust Voice Activity Detection with Low Algorithmic Delay

aau-es-ml/VAD-with-adversarial-multi-task-learning • 4 Jul 2022

Introducing adversarial multi-task learning to the model is observed to increase performance in terms of Area Under Curve (AUC), particularly in noisy environments, while the performance is not degraded at higher SNR levels.

Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering

idiap/zff_vad • 27 Jun 2022

This paper investigates the potential of zero-frequency filtering for jointly modeling voice source and vocal tract system information, and proposes two approaches for VAD.

Low-Latency Speech Separation Guided Diarization for Telephone Conversations

dr-pato/ssgd • 5 Apr 2022

In particular, we compare two low-latency speech separation models.

GAN-Based Joint Activity Detection and Channel Estimation For Grant-free Random Access

deeeeeeplearning/jadce • 4 Apr 2022

Joint activity detection and channel estimation (JADCE) for grant-free random access is a critical issue that needs to be addressed to support massive connectivity in IoT networks.

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios

alibaba-damo-academy/FunASR • 18 Mar 2022

Through this formulation, we propose the speaker embedding-aware neural diarization (SEND) framework, where a speech encoder, a speaker encoder, two similarity scorers, and a post-processing network are jointly optimized to predict the encoded labels according to the similarities between speech features and speaker embeddings.

3,096

HGCN: Harmonic gated compensation network for speech enhancement

wangtianrui/HGCN • 30 Jan 2022

Mask processing in the time-frequency (T-F) domain through the neural network has been one of the mainstreams for single-channel speech enhancement.
