Sound Event Detection

74 papers with code • 4 benchmarks • 18 datasets

Sound Event Detection (SED) is the task of recognizing sound events and their respective start and end times in a recording. Sound events in real life do not always occur in isolation; they tend to overlap considerably. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
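
To make the definition concrete, here is a minimal sketch of how overlapping (polyphonic) events can be represented. The event labels, times, hop size, and the helper `events_to_frames` are illustrative assumptions, not taken from any dataset or paper listed here.

```python
import numpy as np

# Hypothetical annotations: (label, onset_seconds, offset_seconds).
events = [
    ("dog_bark", 0.5, 2.0),
    ("car_horn", 1.5, 3.0),
    ("speech", 2.5, 4.0),
]

def events_to_frames(events, labels, duration, hop=0.5):
    """Convert (label, onset, offset) events to a binary frame-by-class matrix."""
    n_frames = int(np.ceil(duration / hop))
    activity = np.zeros((n_frames, len(labels)), dtype=int)
    for label, onset, offset in events:
        first = int(np.floor(onset / hop))   # first frame touched by the event
        last = int(np.ceil(offset / hop))    # one past the last frame touched
        activity[first:last, labels.index(label)] = 1
    return activity

labels = ["dog_bark", "car_horn", "speech"]
activity = events_to_frames(events, labels, duration=4.0)

# Number of simultaneously active events in each frame.
polyphony = activity.sum(axis=1)
```

Frames where `polyphony` exceeds 1 are those in which events overlap, which is the case polyphonic SED has to handle.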

Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning

ilyassmoummad/dcase23_task5_scl 2 Sep 2023

The bioacoustic community recast the problem of sound event detection within the framework of few-shot learning, i.e., training a system with only a few labeled examples.

7

Post-Processing Independent Evaluation of Sound Event Detection Systems

fgnt/sed_scores_eval 27 Jun 2023

It summarizes the system performance over a range of operating modes resulting from varying the decision threshold that is used to translate the system output scores into a binary detection output.

22
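
The idea of varying the decision threshold, as described in the entry above, can be sketched as follows. This is a generic illustration with made-up frame-level scores and a hypothetical helper `operating_points`; it does not use the `sed_scores_eval` API and is not the metric proposed in the paper.

```python
import numpy as np

# Hypothetical frame-level scores for one event class, with ground-truth labels.
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.9, 0.2])
truth = np.array([0, 0, 1, 1, 1, 0])

def operating_points(scores, truth, thresholds):
    """Return (threshold, true-positive rate, false-positive rate) per threshold."""
    points = []
    for t in thresholds:
        detected = scores >= t  # binary detection output at this threshold
        tp = np.sum(detected & (truth == 1))
        fp = np.sum(detected & (truth == 0))
        tpr = tp / np.sum(truth == 1)
        fpr = fp / np.sum(truth == 0)
        points.append((float(t), float(tpr), float(fpr)))
    return points

points = operating_points(scores, truth, thresholds=[0.3, 0.5, 0.7])
```

Each tuple in `points` is one operating mode. Summarizing performance over all of them avoids tying the evaluation to a single, possibly poorly tuned, threshold.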

Few-shot bioacoustic event detection at the DCASE 2023 challenge

c4dm/dcase-few-shot-bioacoustic 15 Jun 2023

Few-shot bioacoustic event detection consists in detecting sound events of specified types, in varying soundscapes, while having access to only a few examples of the class of interest.

44

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

audio-westlakeu/audiossl 7 Jun 2023

In order to tackle both clip-level and frame-level tasks, this paper proposes Audio Teacher-Student Transformer (ATST), with a clip-level version (named ATST-Clip) and a frame-level version (named ATST-Frame), responsible for learning clip-level and frame-level representations, respectively.

65

Adversarial Representation Learning for Robust Privacy Preservation in Audio

lndip/rdal 29 Apr 2023

In this study, we propose a novel adversarial training method for learning representations of audio recordings that effectively prevents the detection of speech activity from the latent features of the recordings.

0

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

xinhaomei/wavcaps 30 Mar 2023

To address this data scarcity issue, we introduce WavCaps, the first large-scale weakly-labelled audio captioning dataset, comprising approximately 400k audio clips with paired captions.

170

AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection

sadPororo/AD-YOLO 28 Mar 2023

Hence, the format enables the model to handle the polyphony problem, regardless of the number of sound overlaps.

18

A dataset for Audio-Visual Sound Event Detection in Movies

usc-sail/mica-subtitle-aligned-movie-sounds 14 Feb 2023

In this work, we present a dataset of audio events called Subtitle-Aligned Movie Sounds (SAM-S).

21

Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks

j22melody/sed_great_ape 5 Jan 2023

We present a novel approach to automatically detect and classify great ape calls from continuous raw audio recordings collected during field research.

3

On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors

zaharah/ood_audio 27 Oct 2022

Out-of-distribution (OOD) detection is concerned with identifying data points that do not belong to the same distribution as the model's training data.

4
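
As a rough illustration of nearest-neighbor OOD scoring, which the title of the entry above refers to, the sketch below scores each test embedding by its distance to the closest training embedding. The embeddings, the L2 normalization step, and the helper `knn_ood_scores` are assumptions made for this example; this is not the code from the linked repository.

```python
import numpy as np

def knn_ood_scores(train_emb, test_emb, k=1):
    """Distance to the k-th nearest training embedding; larger means more OOD."""
    # L2-normalize so distances depend on direction rather than magnitude.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    return np.sort(dists, axis=1)[:, k - 1]

# Hypothetical 2-D embeddings standing in for features from an audio model.
train_emb = np.array([[1.0, 0.0], [0.9, 0.1], [1.0, 0.2]])
test_emb = np.array([[1.0, 0.05], [0.0, 1.0]])

scores = knn_ood_scores(train_emb, test_emb, k=1)
```

The first test point lies close to the training embeddings and receives a low score, while the second points in a different direction and receives a high one. In practice a threshold on this score would separate in-distribution from OOD inputs.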