Sound Event Detection

75 papers with code • 4 benchmarks • 18 datasets

Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. Sound events in real life do not always occur in isolation; they often overlap considerably with one another. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
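
The task definition above (event labels with start and end times, possibly overlapping) can be illustrated with a small sketch. It assumes a detector has already produced a binary frame-by-class activity matrix; the function name `frames_to_events`, the `hop_seconds` parameter, and the example labels are hypothetical and not taken from any of the listed papers.

```python
from typing import List, Sequence, Tuple

Event = Tuple[str, float, float]  # (label, onset in seconds, offset in seconds)


def frames_to_events(
    activity: Sequence[Sequence[int]],
    labels: Sequence[str],
    hop_seconds: float,
) -> List[Event]:
    """Turn frame-wise class activity into a list of timed events.

    activity[t][c] is 1 when class c is active in frame t (assumed input).
    Each class is scanned independently, so events of different classes
    may overlap in time, which is the polyphonic case.
    """
    events: List[Event] = []
    for c, label in enumerate(labels):
        onset = None  # frame index where the current run of activity started
        for t, frame in enumerate(activity):
            active = bool(frame[c])
            if active and onset is None:
                onset = t
            elif not active and onset is not None:
                events.append((label, onset * hop_seconds, t * hop_seconds))
                onset = None
        if onset is not None:  # event still active at the end of the recording
            events.append((label, onset * hop_seconds, len(activity) * hop_seconds))
    return sorted(events, key=lambda e: (e[1], e[2]))


# Two classes that overlap in the second frame.
example = [[1, 0], [1, 1], [0, 1], [0, 1], [0, 0]]
print(frames_to_events(example, ["speech", "car"], hop_seconds=0.5))
```

In this toy input the two events overlap between 0.5 s and 1.0 s, so a monophonic detector that reports only one class per frame could not represent it.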



Most implemented papers

Towards Deep Learning Models Resistant to Adversarial Attacks

MadryLab/mnist_challenge ICLR 2018

Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.

PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions

elegan23/hypernets 8 Oct 2021

In this paper, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models.

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

xinhaomei/wavcaps 30 Mar 2023

To address this data scarcity issue, we introduce WavCaps, the first large-scale weakly-labelled audio captioning dataset, comprising approximately 400k audio clips with paired captions.

Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

yardencsGitHub/tf_syllable_segmentation_annotation 4 Apr 2016

In this paper we present an approach to polyphonic sound event detection in real-life recordings based on bi-directional long short-term memory (BLSTM) recurrent neural networks (RNNs).

Adaptive pooling operators for weakly labeled sound event detection

marl/autopool 26 Apr 2018

In this work, we treat SED as a multiple instance learning (MIL) problem, where training labels are static over a short excerpt, indicating the presence or absence of sound sources but not their temporal locality.
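
The multiple instance learning setup described above needs a way to reduce frame-level predictions to one clip-level prediction that can be compared with the weak label. A minimal sketch of one such pooling operator follows, assuming a softmax-weighted average controlled by a scalar `alpha`. The function name `auto_pool` and the example values are illustrative; this is not the API of the marl/autopool repository.

```python
import math
from typing import Sequence


def auto_pool(frame_probs: Sequence[float], alpha: float) -> float:
    """Softmax-weighted pooling of frame-level probabilities for one class.

    alpha = 0 gives the mean, a large alpha approaches the max, and
    intermediate values interpolate between the two. In a trainable
    model alpha would be a learned parameter (assumption of this sketch).
    """
    scaled = [alpha * p for p in frame_probs]
    shift = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return sum(p * w for p, w in zip(frame_probs, weights)) / total


probs = [0.1, 0.2, 0.9]  # hypothetical frame-level outputs for one class
for a in (0.0, 1.0, 1000.0):
    print(a, auto_pool(probs, a))
```

Mean pooling tends to under-predict short events and max pooling passes gradient to a single frame only, which is one reason to consider an operator that sits between the two.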

Learning Sound Event Classifiers from Web Audio with Noisy Labels

edufonseca/icassp19 4 Jan 2019

To foster the investigation of label noise in sound event classification, we present FSDnoisy18k, a dataset containing 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.

SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks

giusenso/seld-tcn 3 Mar 2020

The understanding of the surrounding environment plays a critical role in autonomous robotic systems, such as self-driving cars.

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection

sharathadavanne/seld-dcase2021 29 Oct 2020

Conventional NN-based methods use two branches for a sound event detection (SED) target and a direction-of-arrival (DOA) target.

Couple Learning for semi-supervised sound event detection

Toshiba-RDC/dcase20_task4 12 Oct 2021

The recently proposed Mean Teacher method, which exploits large-scale unlabeled data in a self-ensembling manner, has achieved state-of-the-art results in several semi-supervised learning benchmarks.
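
The self-ensembling idea behind Mean Teacher can be sketched in a few lines: a teacher model's weights track an exponential moving average of the student's weights, and a consistency term penalizes disagreement between the two models' predictions on unlabeled clips. The names `ema_update` and `consistency_loss`, the dictionary-of-floats weight format, and the `decay` value are assumptions made for illustration, not code from the listed repository.

```python
from typing import Dict, Sequence


def ema_update(
    teacher: Dict[str, float],
    student: Dict[str, float],
    decay: float = 0.999,
) -> Dict[str, float]:
    """Return teacher weights moved toward the student by an EMA step."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def consistency_loss(
    student_probs: Sequence[float],
    teacher_probs: Sequence[float],
) -> float:
    """Mean squared difference between student and teacher predictions.

    No labels are needed, so this term can be computed on unlabeled audio.
    """
    diffs = [(s - t) ** 2 for s, t in zip(student_probs, teacher_probs)]
    return sum(diffs) / len(diffs)


# Toy values: one weight, and predictions for two classes.
teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
print(teacher)
print(consistency_loss([0.5, 1.0], [0.5, 0.0]))
```

In training, the consistency term would be added to the supervised loss on the labeled subset, and only the student would be updated by gradient descent.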

RCT: Random Consistency Training for Semi-supervised Sound Event Detection

Audio-WestlakeU/RCT-Random-Consistency-Training 21 Oct 2021

Sound event detection (SED), as a core module of acoustic environmental analysis, suffers from the problem of data deficiency.