Sound Event Detection

74 papers with code • 4 benchmarks • 18 datasets

Sound Event Detection (SED) is the task of recognizing sound events and their respective start and end times in a recording. Sound events in real life do not always occur in isolation, but tend to overlap considerably with each other. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
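
To make the task definition concrete, the following is a minimal sketch of one common way to turn frame-level predictions into sound events with start and end times. It is not taken from any of the papers listed here. The function name `frames_to_events`, the 0.5 threshold, the hop size, and the class names are all illustrative assumptions. Because each class is processed independently, events from different classes may overlap in time, which is the polyphonic case.

```python
import numpy as np


def frames_to_events(probs, class_names, hop_seconds=0.02, threshold=0.5):
    """Convert frame-level class probabilities into (label, onset, offset) events.

    probs: array-like of shape (frames, classes), values in [0, 1].
    hop_seconds and threshold are illustrative defaults, not standard values.
    """
    active = np.asarray(probs) >= threshold  # boolean activity, shape (frames, classes)
    events = []
    for c, name in enumerate(class_names):
        # Pad with zeros so events touching the first or last frame are closed.
        padded = np.concatenate(([0], active[:, c].astype(int), [0]))
        change = np.diff(padded)
        onsets = np.flatnonzero(change == 1)    # first active frame of each run
        offsets = np.flatnonzero(change == -1)  # first inactive frame after each run
        for on, off in zip(onsets, offsets):
            events.append((name, float(on * hop_seconds), float(off * hop_seconds)))
    # Sort by onset time; overlapping events from different classes are kept.
    return sorted(events, key=lambda e: e[1])


# Hypothetical 6-frame clip with two classes that overlap in frames 1-2.
example_probs = [
    [0.9, 0.1],
    [0.8, 0.7],
    [0.7, 0.9],
    [0.2, 0.8],
    [0.1, 0.2],
    [0.1, 0.1],
]
for label, onset, offset in frames_to_events(
    example_probs, ["dog_bark", "speech"], hop_seconds=0.5
):
    print(f"{label}: {onset:.1f}s to {offset:.1f}s")
```

Practical systems usually add smoothing (for example median filtering) and per-class thresholds before this step; those are omitted to keep the sketch short.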

Latest papers with no code

Sound Event Detection and Localization with Distance Estimation

no code yet • 18 Mar 2024

Sound Event Detection and Localization (SELD) is a combined task of identifying sound events and their corresponding direction-of-arrival (DOA).

Multitask frame-level learning for few-shot sound event detection

no code yet • 17 Mar 2024

This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples.

Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals

no code yet • 16 Mar 2024

Sound event detection (SED) is an active area of audio research that aims to detect the temporal occurrence of sounds.

Dual Knowledge Distillation for Efficient Sound Event Detection

no code yet • 5 Feb 2024

To address this issue, we introduce a novel framework referred to as dual knowledge distillation for developing efficient SED systems in this work.

BAT: Learning to Reason about Spatial Sounds with Large Language Models

no code yet • 2 Feb 2024

By integrating Spatial-AST with LLaMA-2 7B model, BAT transcends standard Sound Event Localization and Detection (SELD) tasks, enabling the model to reason about the relationships between the sounds in its environment.

tinyCLAP: Distilling Constrastive Language-Audio Pretrained Models

no code yet • 24 Nov 2023

Contrastive Language-Audio Pretraining (CLAP) became of crucial importance in the field of audio and speech processing.

Interactive Dual-Conformer with Scene-Inspired Mask for Soft Sound Event Detection

no code yet • 23 Nov 2023

In this paper, we first propose an interactive dual-conformer (IDC) module, in which a cross-interaction mechanism is applied to effectively exploit the information from soft labels.

SwG-former: A Sliding-Window Graph Convolutional Network for Simultaneous Spatial-Temporal Information Extraction in Sound Event Localization and Detection

no code yet • 21 Oct 2023

Sound event localization and detection (SELD) involves sound event detection (SED) and direction of arrival (DoA) estimation tasks.

Evaluating Classification Systems Against Soft Labels with Fuzzy Precision and Recall

no code yet • 25 Sep 2023

Classification systems are normally trained by minimizing the cross-entropy between system outputs and reference labels, which makes the Kullback-Leibler divergence a natural choice for measuring how closely the system can follow the data.

Online Active Learning For Sound Event Detection

no code yet • 25 Sep 2023

Online Active Learning (OAL) is a paradigm that addresses this issue by simultaneously minimizing the amount of annotation required to train a classifier and adapting to changes in the data over the duration of the data collection process.