Sound Event Detection
72 papers with code • 4 benchmarks • 18 datasets
Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. Sound events in real life rarely occur in isolation and tend to overlap considerably. Recognizing such overlapping sound events is referred to as polyphonic SED.
Source: A report on sound event detection with different binaural features
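To make the task definition concrete, here is a minimal sketch of how per-class frame probabilities might be turned into labeled events with start and end times. The function name `frames_to_events`, the fixed 0.5 threshold, the hop size, and the toy probabilities are all illustrative assumptions, not taken from any paper listed on this page. Practical systems usually add smoothing (for example median filtering) before or after thresholding.

```python
from typing import Dict, List, Optional, Tuple

# (label, onset_seconds, offset_seconds); a hypothetical event format
Event = Tuple[str, float, float]


def frames_to_events(
    activity: Dict[str, List[float]],
    hop_seconds: float = 0.02,   # assumed time step between frames
    threshold: float = 0.5,      # assumed decision threshold
) -> List[Event]:
    """Convert per-class frame probabilities into (label, onset, offset) events.

    Each class is decoded independently, so events of different classes
    may overlap in time (the polyphonic case).
    """
    events: List[Event] = []
    for label, probs in activity.items():
        onset: Optional[int] = None  # frame index where the current event began
        for i, p in enumerate(probs):
            active = p >= threshold
            if active and onset is None:
                onset = i
            elif not active and onset is not None:
                events.append((label, onset * hop_seconds, i * hop_seconds))
                onset = None
        if onset is not None:
            # The event is still active when the recording ends.
            events.append((label, onset * hop_seconds, len(probs) * hop_seconds))
    # Sort by onset time, then by label, for a stable listing.
    return sorted(events, key=lambda e: (e[1], e[0]))


# Toy input: "speech" and "dog_bark" are both active in frames 2 and 3.
activity = {
    "speech":   [0.1, 0.8, 0.9, 0.7, 0.2, 0.1],
    "dog_bark": [0.0, 0.1, 0.6, 0.9, 0.8, 0.3],
}
events = frames_to_events(activity, hop_seconds=0.5)
for label, onset, offset in events:
    print(f"{label}: {onset:.1f}s to {offset:.1f}s")
```

In this toy input the two events overlap between 1.0 s and 2.0 s, which is the situation polyphonic SED has to handle. Decoding each class independently is one simple way to allow such overlaps.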
Libraries
Use these libraries to find Sound Event Detection models and implementations
Datasets
Latest papers with no code
Online Active Learning For Sound Event Detection
Online Active Learning (OAL) is a paradigm that addresses this issue by simultaneously minimizing the amount of annotation required to train a classifier and adapting to changes in the data over the duration of the data collection process.
Semi-supervised Sound Event Detection with Local and Global Consistency Regularization
Then, the local consistency is adopted to encourage the model to leverage local features for frame-level predictions, and the global consistency is applied to force features to align with global prototypes through a specially designed contrastive loss.
Furnishing Sound Event Detection with Language Model Abilities
Recently, the ability of language models (LMs) has attracted increasing attention in visual cross-modality.
DiffSED: Sound Event Detection with Denoising Diffusion
In this work, we reformulate the SED problem by taking a generative learning perspective.
Auditory Neural Response Inspired Sound Event Detection Based on Spectro-temporal Receptive Field
In this work, we used the STRF as the kernel of the first convolutional layer of the SED model to extract a neural response from the input sound, making the SED model more similar to the human auditory system.
Channel-Spatial-Based Few-Shot Bird Sound Event Detection
In this paper, we propose a model for bird sound event detection that focuses on a small number of training samples within the everyday long-tail distribution.
Semi-supervised Learning-based Sound Event Detection using Frequency Dynamic Convolution with Large Kernel Attention for DCASE Challenge 2023 Task 4
The proposed FDY with LKA-CRNN with a BEATs embedding network is initially trained on the entire DCASE 2023 Task 4 dataset using the mean-teacher approach, generating pseudo-labels for the weakly labeled, unlabeled, and AudioSet data.
Divided spectro-temporal attention for sound event localization and detection in real scenes for DCASE2023 challenge
Localizing sounds and detecting events in different room environments is a difficult task, mainly due to the wide range of reflections and reverberations.
A Multi-Task Learning Framework for Sound Event Detection using High-level Acoustic Characteristics of Sounds
Sound event detection (SED) entails identifying the type of sound and estimating its temporal boundaries from acoustic signals.
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations
This work investigates pretrained audio representations for few shot Sound Event Detection.