Sound Event Localization and Detection
28 papers with code • 5 benchmarks • 8 datasets
Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each of the target sound classes, along with one or more corresponding spatial trajectories when the track indicates activity. This yields a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inference on the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance, among others.
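To make the output format described above concrete, here is a minimal sketch of decoding a frame-wise SELD prediction into discrete events. The format (per-frame class activity probabilities plus a 3D direction-of-arrival unit vector per class) and all names are illustrative assumptions, not tied to any specific paper on this page:

```python
import numpy as np

# Assumed illustrative SELD output format: per time frame, a detection
# probability for each sound class and a 3D direction-of-arrival (DOA)
# unit vector for that class. Random values stand in for model output.
rng = np.random.default_rng(0)
n_frames, n_classes = 4, 3

activity = rng.random((n_frames, n_classes))        # detection probabilities
doa = rng.normal(size=(n_frames, n_classes, 3))     # raw DOA vectors
doa /= np.linalg.norm(doa, axis=-1, keepdims=True)  # normalize to unit length

def decode_seld(activity, doa, threshold=0.5):
    """Return (frame, class, azimuth_deg, elevation_deg) for active events."""
    events = []
    for t, c in zip(*np.where(activity > threshold)):
        x, y, z = doa[t, c]
        azimuth = np.degrees(np.arctan2(y, x))                    # [-180, 180]
        elevation = np.degrees(np.arcsin(np.clip(z, -1.0, 1.0)))  # [-90, 90]
        events.append((int(t), int(c), azimuth, elevation))
    return events

events = decode_seld(activity, doa)
```

Variants of this representation exist; for example, some systems fold activity into the DOA vector's magnitude instead of predicting it separately, so the detection threshold is applied to the vector norm.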
Most implemented papers
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation.
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection
Sound event localization and detection (SELD) consists of two subtasks, which are sound event detection and direction-of-arrival estimation.
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection
Data augmentation methods have shown great importance in diverse supervised learning problems where labeled data is scarce or costly to obtain.
Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head
Sound event localization and detection (SELD) is a combined task of identifying the sound event and its direction.
Echo-aware Adaptation of Sound Event Localization and Detection in Unknown Environments
Our goal is to develop a sound event localization and detection (SELD) system that works robustly in unknown environments.
L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment
The L3DAS22 Challenge is aimed at encouraging the development of machine learning strategies for 3D speech enhancement and 3D sound localization and detection in office-like environments.
Filler Word Detection and Classification: A Dataset and Benchmark
In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions.
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation
We show that our dual quaternion SELD model with temporal convolution blocks (DualQSELD-TCN) achieves better results with respect to real and quaternion-valued baselines thanks to our augmented representation of the sound field.
A Synapse-Threshold Synergistic Learning Approach for Spiking Neural Networks
Most existing methods for training SNNs are based on the concept of synaptic plasticity; however, learning in the realistic brain also utilizes intrinsic non-synaptic mechanisms of neurons.
Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains
Our system submitted to the DCASE 2022 Task 3 is based on our previous proposed Event-Independent Network V2 (EINV2) with a novel data augmentation method.