Sound Event Detection
74 papers with code • 4 benchmarks • 18 datasets
Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. Sound events in real life do not always occur in isolation; they tend to overlap considerably. Recognizing such overlapping sound events is referred to as polyphonic SED.
Source: A report on sound event detection with different binaural features
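To make the task definition concrete, here is a minimal sketch of how per-class, frame-level activity scores might be turned into events with start and end times. The function name `frames_to_events`, the hop size, the 0.5 threshold, and the example scores are all illustrative assumptions, not taken from any of the papers listed on this page. Each class is thresholded independently, which is what allows the output events to overlap in time (the polyphonic case).

```python
from typing import Dict, List, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(
    activity: Dict[str, List[float]],
    hop_seconds: float = 0.02,   # assumed frame hop, purely illustrative
    threshold: float = 0.5,      # assumed decision threshold
) -> List[Event]:
    """Convert per-class frame probabilities into (label, onset, offset) events."""
    events: List[Event] = []
    for label, probs in activity.items():
        start = None  # index of the first frame of the currently open event
        for i, p in enumerate(probs):
            active = p >= threshold
            if active and start is None:
                start = i  # event begins at this frame
            elif not active and start is not None:
                events.append((label, start * hop_seconds, i * hop_seconds))
                start = None  # event ended at the previous frame
        if start is not None:
            # Event is still active when the recording ends.
            events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    # Sort by onset time, then label, for a stable listing.
    return sorted(events, key=lambda e: (e[1], e[0]))


# Hypothetical scores for two classes that overlap in time.
example_activity = {
    "speech":   [0.1, 0.9, 0.8, 0.7, 0.2, 0.1],
    "dog_bark": [0.0, 0.0, 0.6, 0.9, 0.9, 0.3],
}

for label, onset, offset in frames_to_events(example_activity, hop_seconds=1.0):
    print(f"{label}: {onset:.1f}s to {offset:.1f}s")
```

With a one-second hop, the two example events share the interval from 2 s to 4 s, so a monophonic detector that reports only one class per frame could not represent them both.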
Latest papers with no code
Sound Event Detection and Localization with Distance Estimation
Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA).
Multitask frame-level learning for few-shot sound event detection
This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples.
Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals
Sound event detection (SED) is an active area of audio research that aims to detect the temporal occurrence of sounds.
Dual Knowledge Distillation for Efficient Sound Event Detection
To address this issue, we introduce a novel framework, referred to as dual knowledge distillation, for developing efficient SED systems.
BAT: Learning to Reason about Spatial Sounds with Large Language Models
By integrating Spatial-AST with LLaMA-2 7B model, BAT transcends standard Sound Event Localization and Detection (SELD) tasks, enabling the model to reason about the relationships between the sounds in its environment.
tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models
Contrastive Language-Audio Pretraining (CLAP) has become crucially important in the field of audio and speech processing.
Interactive Dual-Conformer with Scene-Inspired Mask for Soft Sound Event Detection
In this paper, we first propose an interactive dual-conformer (IDC) module, in which a cross-interaction mechanism is applied to effectively exploit the information from soft labels.
SwG-former: A Sliding-Window Graph Convolutional Network for Simultaneous Spatial-Temporal Information Extraction in Sound Event Localization and Detection
Sound event localization and detection (SELD) involves sound event detection (SED) and direction of arrival (DoA) estimation tasks.
Evaluating Classification Systems Against Soft Labels with Fuzzy Precision and Recall
Classification systems are normally trained by minimizing the cross-entropy between system outputs and reference labels, which makes the Kullback-Leibler divergence a natural choice for measuring how closely the system can follow the data.
Online Active Learning For Sound Event Detection
Online Active Learning (OAL) is a paradigm that addresses this issue by simultaneously minimizing the amount of annotation required to train a classifier and adapting to changes in the data over the duration of the data collection process.
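The entry above on evaluating classification systems against soft labels links cross-entropy training to the Kullback-Leibler divergence. The short sketch below illustrates the standard identity behind that link: for a reference distribution p and a system output q, the cross-entropy equals the entropy of p plus KL(p || q), so with p fixed, minimizing one minimizes the other. The function names and the example distributions are assumptions chosen for illustration; this does not implement the fuzzy precision and recall proposed in that paper.

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0) in the system output


def cross_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(pi * math.log(max(qi, EPS)) for pi, qi in zip(p, q))


def entropy(p: Sequence[float]) -> float:
    """H(p) = -sum_i p_i * log(p_i), with 0 * log(0) taken as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / max(qi, EPS)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical soft reference label and system output over three classes.
soft_label = [0.7, 0.2, 0.1]
system_output = [0.6, 0.3, 0.1]

ce = cross_entropy(soft_label, system_output)
h = entropy(soft_label)
kl = kl_divergence(soft_label, system_output)

# The two printed quantities agree up to floating-point error.
print(f"cross-entropy      = {ce:.6f}")
print(f"entropy + KL(p||q) = {h + kl:.6f}")
```

Because the entropy term depends only on the reference labels, the KL divergence isolates the part of the cross-entropy that the system can actually reduce. It is zero exactly when the output matches the soft label.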