Sound Event Detection
74 papers with code • 4 benchmarks • 18 datasets
Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. Sound events in real life do not always occur in isolation; they often overlap considerably. Recognizing such overlapping sound events is referred to as polyphonic SED.
Source: A report on sound event detection with different binaural features
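To make the task definition concrete, here is a minimal sketch of how frame-wise class activity can be turned into labeled events with start and end times. It is not taken from any of the listed papers. The function name frames_to_events, the class labels, and the 0.5 s hop size are all assumptions chosen for illustration. Because each class is tracked independently, overlapping (polyphonic) events come out naturally.

```python
from typing import List, Sequence, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(
    activity: Sequence[Sequence[int]],
    class_names: Sequence[str],
    hop_seconds: float,
) -> List[Event]:
    """Convert a frames x classes 0/1 activity matrix into a list of events.

    Hypothetical helper for illustration: each run of consecutive active
    frames for a class becomes one event.
    """
    events: List[Event] = []
    for c, name in enumerate(class_names):
        onset = None  # frame index where the current run started, if any
        for t, frame in enumerate(activity):
            active = bool(frame[c])
            if active and onset is None:
                onset = t
            elif not active and onset is not None:
                events.append((name, onset * hop_seconds, t * hop_seconds))
                onset = None
        if onset is not None:
            # The run is still active at the end of the recording.
            events.append((name, onset * hop_seconds, len(activity) * hop_seconds))
    # Sort by onset, then offset, then label for a stable, readable listing.
    return sorted(events, key=lambda e: (e[1], e[2], e[0]))


# Assumed example: two classes, five frames, 0.5 s per frame.
example_activity = [
    [1, 0],
    [1, 1],
    [1, 1],
    [0, 1],
    [0, 0],
]
example_events = frames_to_events(example_activity, ["speech", "dog_bark"], 0.5)
print(example_events)
```

In this assumed example the two events overlap between 0.5 s and 1.5 s, which is the polyphonic case described above. In practice the activity matrix would usually come from thresholding a model's per-frame class probabilities.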
Libraries
Use these libraries to find Sound Event Detection models and implementations
Datasets
Latest papers
UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization
Video localization tasks aim to temporally locate specific instances in videos, including temporal action localization (TAL), sound event detection (SED) and audio-visual event localization (AVEL).
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection
A recent development in the field is the introduction of the task known as few-shot bioacoustic sound event detection, which aims to train a versatile animal sound detector using only a small set of audio samples.
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
Recently, 2D convolution has been found ill-suited to sound event detection (SED).
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection
Sound event localization and detection (SELD) combines two subtasks: sound event detection (SED) and direction of arrival (DOA) estimation.
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
By applying this approach to SELD, we can leverage a substantial amount of unlabeled 3D audio data to learn robust representations of sound events and their locations.
AudioLog: LLMs-Powered Long Audio Logging with Hybrid Token-Semantic Contrastive Learning
This paper presents AudioLog, a large language model (LLM)-powered audio logging system with hybrid token-semantic contrastive learning.
Performance and energy balance: a comprehensive study of state-of-the-art sound event detection systems
In recent years, deep learning systems have shown a concerning trend toward increased complexity and higher energy consumption.
Regularized Contrastive Pre-training for Few-shot Bioacoustic Sound Detection
Bioacoustic sound event detection allows for better understanding of animal behavior and for better monitoring biodiversity using audio.
Fine-tune the pretrained ATST model for sound event detection
In this work, we study the fine-tuning method of the pretrained models for SED.
Leveraging Geometrical Acoustic Simulations of Spatial Room Impulse Responses for Improved Sound Event Detection and Localization
As deeper and more complex models are developed for the task of sound event localization and detection (SELD), the demand for annotated spatial audio data continues to increase.