Sound Event Localization and Detection

28 papers with code • 5 benchmarks • 8 datasets

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each of the target sound classes, along with one or more corresponding spatial trajectories when the track indicates activity. This results in a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inference on the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance.
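
The output described above can be pictured as a per-frame activity grid plus a direction for each active class. The following is only a minimal sketch of that idea, not the format of any particular SELD system or challenge: it assumes at most one source per class per frame, directions given as (azimuth, elevation) in degrees, and the names `SeldEvent` and `extract_events` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Direction = Tuple[float, float]  # (azimuth, elevation) in degrees; an assumed convention


@dataclass
class SeldEvent:
    """One contiguous activation of a sound class and its spatial trajectory."""
    class_name: str
    onset_frame: int              # first active frame (inclusive)
    offset_frame: int             # first inactive frame after the event (exclusive)
    trajectory: List[Direction]   # one direction per active frame


def extract_events(
    activity: Sequence[Sequence[bool]],
    doa: Sequence[Sequence[Direction]],
    class_names: Sequence[str],
) -> List[SeldEvent]:
    """Turn frame-wise tracks into events.

    activity[t][c] is True when class c is active at frame t;
    doa[t][c] is the direction estimate for class c at frame t
    (ignored while the class is inactive).
    """
    events = []
    num_frames = len(activity)
    for c, name in enumerate(class_names):
        start = None
        # Iterate one step past the end so an event still active
        # at the last frame gets closed.
        for t in range(num_frames + 1):
            active = t < num_frames and activity[t][c]
            if active and start is None:
                start = t
            elif not active and start is not None:
                path = [doa[i][c] for i in range(start, t)]
                events.append(SeldEvent(name, start, t, path))
                start = None
    return events


# Toy scene with made-up values: 4 frames, 2 classes.
class_names = ["speech", "door"]
activity = [
    [True, False],
    [True, True],
    [False, True],
    [False, False],
]
doa = [
    [(10.0, 0.0), (0.0, 0.0)],
    [(12.0, 0.0), (-90.0, 5.0)],
    [(0.0, 0.0), (-88.0, 5.0)],
    [(0.0, 0.0), (0.0, 0.0)],
]
events = extract_events(activity, doa, class_names)
for event in events:
    print(event)
```

Each resulting event pairs a temporal extent (the activation track) with the directions observed while the class was active (the trajectory). Handling several simultaneous sources of the same class would need a richer representation than this sketch uses.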

Most implemented papers

What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis

thomeou/SALSA 22 Jul 2021

Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation.

SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection

thomeou/SALSA 1 Oct 2021

Sound event localization and detection (SELD) consists of two subtasks, which are sound event detection and direction-of-arrival estimation.

Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection

rfalcon100/Spatial-Mixup-Pytorch 12 Oct 2021

Data augmentation methods have shown great importance in diverse supervised learning problems where labeled data is scarce or costly to obtain.

Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head

nttrd-mdlab/wearable-seld-dataset 17 Feb 2022

Sound event localization and detection (SELD) is a combined task of identifying the sound event and its direction.

Echo-aware Adaptation of Sound Event Localization and Detection in Unknown Environments

nttrd-mdlab/seld-foa-meir 18 Feb 2022

Our goal is to develop a sound event localization and detection (SELD) system that works robustly in unknown environments.

L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment

l3das/l3das22 21 Feb 2022

The L3DAS22 Challenge is aimed at encouraging the development of machine learning strategies for 3D speech enhancement and 3D sound localization and detection in office-like environments.

Filler Word Detection and Classification: A Dataset and Benchmark

gzhu06/PodcastFillers_Utils 28 Mar 2022

In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions.

Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation

ispamm/dualqseld-tcn 4 Apr 2022

We show that our dual quaternion SELD model with temporal convolution blocks (DualQSELD-TCN) achieves better results with respect to real and quaternion-valued baselines thanks to our augmented representation of the sound field.

A Synapse-Threshold Synergistic Learning Approach for Spiking Neural Networks

sunhongze/STL-SNN 10 Jun 2022

Most existing methods for training SNNs are based on the concept of synaptic plasticity; however, learning in the realistic brain also utilizes intrinsic non-synaptic mechanisms of neurons.

Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains

jinbo-hu/dcase2022-task3 5 Sep 2022

Our system submitted to DCASE 2022 Task 3 is based on our previously proposed Event-Independent Network V2 (EINV2) with a novel data augmentation method.