Sound Event Localization and Detection

28 papers with code • 5 benchmarks • 8 datasets

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each target sound class, along with one or more corresponding spatial trajectories whenever the track indicates activity. The result is a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inferring the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance.
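The output format described above can be made concrete with a small sketch. It assumes frame-wise predictions for a single class (a binary activity flag and an azimuth/elevation pair per frame) and groups consecutive active frames into tracks. The `Track` class and the `tracks_from_frames` helper are hypothetical names chosen for illustration; they do not come from any of the listed papers or repositories.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

DOA = Tuple[float, float]  # (azimuth_deg, elevation_deg), an assumed convention


@dataclass
class Track:
    """One contiguous activation of a sound class with its spatial trajectory."""
    onset: int  # index of the first active frame
    trajectory: List[DOA] = field(default_factory=list)  # one DOA per active frame

    @property
    def offset(self) -> int:
        # Exclusive end frame: onset plus the number of active frames.
        return self.onset + len(self.trajectory)


def tracks_from_frames(activity: Sequence[int], doas: Sequence[DOA]) -> List[Track]:
    """Group consecutive active frames of one class into tracks."""
    tracks: List[Track] = []
    current = None
    for frame, (active, doa) in enumerate(zip(activity, doas)):
        if active:
            if current is None:
                # A new activation starts at this frame.
                current = Track(onset=frame)
                tracks.append(current)
            current.trajectory.append(doa)
        else:
            # Inactive frame closes any open track; its DOA is ignored.
            current = None
    return tracks
```

For example, `tracks_from_frames([0, 1, 1, 0, 1], [(0, 0), (10, 0), (12, 0), (0, 0), (-30, 5)])` yields two tracks: one covering frames 1 to 2 and one covering frame 4. A full SELD output would hold one such list per target class.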

Benchmarks

These leaderboards track progress in Sound Event Localization and Detection.

Dataset	Best Model
STARSS22	Baseline (FOA)
PodcastFillers	AVC-FillerNet
TAU-NIGENS Spatial Sound Events 2021	SALSA-FOA
L3DAS21	DualQSELD-TCN (parallel)
RWCP Sound Scene Database	STL-SNN

Most implemented papers

Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

sharathadavanne/seld-dcase2022 • 6 Sep 2020

A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training learning-based approaches and for evaluating the submissions on an unlabeled subset.

SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays

thomeou/salsa-lite • 16 Nov 2021

In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs.

SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks

giusenso/seld-tcn • 3 Mar 2020

The understanding of the surrounding environment plays a critical role in autonomous robotic systems, such as self-driving cars.

A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection

yinkalario/EIN-SELD • 2 Jun 2020

This report presents the dataset and the evaluation setup of the Sound Event Localization & Detection (SELD) task for the DCASE 2020 Challenge.

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection

sharathadavanne/seld-dcase2021 • 29 Oct 2020

Conventional NN-based methods use two branches for a sound event detection (SED) target and a direction-of-arrival (DOA) target.
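As the paper title indicates, ACCDOA replaces the two branches with a single Cartesian vector per class that couples activity and direction. A minimal decoding sketch follows, assuming the vector length encodes event activity and the vector direction encodes the DOA. The function name `decode_accdoa`, the 0.5 threshold, and the azimuth/elevation convention are illustrative assumptions, not the API of any listed repository.

```python
import math
from typing import Optional, Tuple


def decode_accdoa(
    vec: Tuple[float, float, float], threshold: float = 0.5
) -> Tuple[bool, Optional[Tuple[float, float]]]:
    """Decode one class's ACCDOA-style vector for one frame.

    Returns (active, (azimuth_deg, elevation_deg)), or (False, None)
    when the vector is too short to count as an active event.
    """
    x, y, z = vec
    norm = math.sqrt(x * x + y * y + z * z)
    if norm <= threshold:
        # Short vector: the class is treated as inactive in this frame.
        return False, None
    # Direction of the vector gives the DOA (assumed convention:
    # azimuth measured in the x-y plane, elevation from that plane).
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(z / norm))
    return True, (azimuth, elevation)
```

With this representation a single regression output covers both subtasks, so no separate SED branch or loss weighting between SED and DOA targets is needed at decoding time.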

Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training

sharathadavanne/seld-dcase2022 • 14 Oct 2021

The multi-ACCDOA format (a class- and track-wise output format) enables the model to handle cases with overlapping events from the same class.

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

sharathadavanne/seld-dcase2022 • 4 Jun 2022

Additionally, the report presents the baseline system that accompanies the dataset in the challenge, with emphasis on its differences from the baselines of previous iterations: namely, the introduction of the multi-ACCDOA representation to handle multiple simultaneous occurrences of events of the same class, and support for additional improved input features for the microphone array format.
