Sound Event Localization and Detection

28 papers with code • 5 benchmarks • 8 datasets

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each of the target sound classes, along with one or more corresponding spatial trajectories when the track indicates activity. This results in a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inference on the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance, among others.
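The output described above can be pictured as a small data structure. The following is a minimal sketch, not taken from any of the listed papers: the names `SoundEvent` and `tracks_to_events`, the frame-index time base, and the (azimuth, elevation) convention in degrees are all assumptions made for illustration, and it simplifies to a single trajectory per class.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Assumed convention: direction of arrival as (azimuth, elevation) in degrees.
DOA = Tuple[float, float]


@dataclass
class SoundEvent:
    """Hypothetical container for one detected and localized event."""
    label: str
    onset: int    # first active frame (inclusive)
    offset: int   # frame after the last active frame (exclusive)
    trajectory: List[DOA] = field(default_factory=list)


def tracks_to_events(tracks: Dict[str, List[Optional[DOA]]]) -> List[SoundEvent]:
    """Group consecutive active frames of each class into events.

    `tracks` maps a class label to one entry per frame: a DOA when the
    class is active in that frame, or None when it is inactive.
    """
    events: List[SoundEvent] = []
    for label, frames in tracks.items():
        current: Optional[SoundEvent] = None
        for t, doa in enumerate(frames):
            if doa is None:
                # An inactive frame closes any event that is still open.
                if current is not None:
                    current.offset = t
                    events.append(current)
                    current = None
            else:
                if current is None:
                    current = SoundEvent(label, onset=t, offset=t)
                current.trajectory.append(doa)
        # Close an event that runs to the end of the recording.
        if current is not None:
            current.offset = len(frames)
            events.append(current)
    return events


# Made-up example: two classes over four frames.
tracks = {
    "speech": [None, (10.0, 0.0), (12.0, 0.0), None],
    "door": [(90.0, 5.0), None, None, (-45.0, 0.0)],
}
events = tracks_to_events(tracks)
```

Each resulting event pairs a temporal extent with the spatial trajectory observed while the class was active, which is the spatio-temporal characterization the task description refers to.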

Most implemented papers

Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019

sharathadavanne/seld-dcase2022 6 Sep 2020

A large-scale realistic dataset of spatialized sound events was generated for the challenge, to be used for training of learning-based approaches, and for evaluation of the submissions in an unlabeled subset.

SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays

thomeou/salsa-lite 16 Nov 2021

In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs.

SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks

giusenso/seld-tcn 3 Mar 2020

The understanding of the surrounding environment plays a critical role in autonomous robotic systems, such as self-driving cars.

A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection

yinkalario/EIN-SELD 2 Jun 2020

This report presents the dataset and the evaluation setup of the Sound Event Localization & Detection (SELD) task for the DCASE 2020 Challenge.

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection

sharathadavanne/seld-dcase2021 29 Oct 2020

Conventional NN-based methods use two branches for a sound event detection (SED) target and a direction-of-arrival (DOA) target.
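As the title suggests, ACCDOA replaces the two targets with a single Cartesian DOA vector whose length carries the activity. The following is a minimal sketch of that idea for one class in one frame, not the authors' code: the function names, the axis convention, and the 0.5 detection threshold are assumptions made for illustration.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def encode_accdoa(active: bool, azimuth_deg: float, elevation_deg: float) -> Vec3:
    """Return a unit DOA vector for an active class, the zero vector otherwise."""
    if not active:
        return (0.0, 0.0, 0.0)
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Assumed axes: x forward, y left, z up.
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def decode_accdoa(vec: Vec3, threshold: float = 0.5) -> Optional[Tuple[float, float]]:
    """Detect activity from the vector length, then read the DOA from its direction.

    Returns (azimuth, elevation) in degrees, or None if the class is inactive.
    The threshold value is an assumption of this sketch.
    """
    x, y, z = vec
    norm = math.sqrt(x * x + y * y + z * z)
    if norm <= threshold:
        return None
    azimuth = math.degrees(math.atan2(y, x))
    # Clamp to guard against rounding slightly outside [-1, 1].
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
    return (azimuth, elevation)
```

With this representation a single regression output per class serves both subtasks: thresholding the vector length gives detection, and the vector's direction gives localization.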

Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training

sharathadavanne/seld-dcase2022 14 Oct 2021

The multi-ACCDOA format (a class- and track-wise output format) enables the model to solve the cases with overlaps from the same class.

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

sharathadavanne/seld-dcase2022 4 Jun 2022

Additionally, the report presents the baseline system that accompanies the dataset in the challenge with emphasis on the differences with the baseline of the previous iterations; namely, introduction of the multi-ACCDOA representation to handle multiple simultaneous occurrences of events of the same class, and support for additional improved input features for the microphone array format.

A hybrid parametric-deep learning approach for sound event localization and detection

andresperezlopez/DCASE2019_task3 27 Aug 2019

This work describes and discusses an algorithm submitted to the Sound Event Localization and Detection Task of DCASE2019 Challenge.

A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection

sharathadavanne/seld-dcase2021 13 Jun 2021

This report presents the dataset and baseline of Task 3 of the DCASE2021 Challenge on Sound Event Localization and Detection (SELD).

DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection

thomeou/SALSA 29 Jun 2021

Sound event localization and detection consists of two subtasks which are sound event detection and direction-of-arrival estimation.