Audio Tagging

41 papers with code • 1 benchmark • 8 datasets

Audio tagging is the task of predicting the tags of audio clips. It covers tasks such as music tagging, acoustic scene classification, and audio event classification.
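Since a clip can carry several tags at once, audio tagging is usually framed as multi-label classification: each tag gets an independent sigmoid score, and all tags above a threshold are kept. A minimal sketch (the model output, tag names, and threshold below are illustrative assumptions, not from any specific paper):

```python
import numpy as np

def tag_clip(logits, tag_names, threshold=0.5):
    """Turn per-tag logits into a set of predicted tags.

    Multi-label: each tag gets an independent sigmoid probability,
    and every tag scoring at or above the threshold is kept.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [name for name, p in zip(tag_names, probs) if p >= threshold]

# Hypothetical per-tag logits for one clip (names and scores are made up).
tags = tag_clip([2.0, -1.5, 0.4], ["speech", "music", "dog_bark"])
```

Unlike single-label classification, no softmax is applied across tags, so any number of tags (including zero) can fire for a clip.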

Most implemented papers

CRNNs for Urban Sound Tagging with spatiotemporal context

multitel-ai/urban-sound-tagging 24 Aug 2020

This paper describes CRNNs we used to participate in Task 5 of the DCASE 2020 challenge.

Efficient Training of Audio Transformers with Patchout

kkoutini/passt 11 Oct 2021

One of the main shortcomings of transformer models, compared to the well-established CNNs, is their computational complexity.

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation

fschmid56/efficientat 9 Nov 2022

We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of 0.483 mAP on AudioSet.
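Transformer-to-CNN knowledge distillation trains a small CNN student to reproduce a large transformer teacher's per-tag predictions. A minimal sketch of such a loss, blending a hard-label term with a soft-target term (this is an illustrative formulation under assumed binary cross-entropy terms, not the paper's exact recipe):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def distillation_loss(student_logits, teacher_logits, labels, lam=0.5):
    """Blend a hard-label loss with a soft-target loss from the teacher.

    Both terms are per-tag binary cross-entropies, matching the
    multi-label setting of audio tagging; `lam` weights the teacher's
    soft targets against the true labels. Illustrative sketch only.
    """
    eps = 1e-7
    s = np.clip(sigmoid(student_logits), eps, 1 - eps)
    t = sigmoid(teacher_logits)
    y = np.asarray(labels, dtype=float)

    def bce(target):
        return -(target * np.log(s) + (1 - target) * np.log(1 - s)).mean()

    return lam * bce(t) + (1 - lam) * bce(y)
```

A student whose logits agree with the teacher and the labels incurs a lower loss than one that contradicts them, which is what drives the distillation.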

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

audio-westlakeu/audiossl 7 Jun 2023

In order to tackle both clip-level and frame-level tasks, this paper proposes Audio Teacher-Student Transformer (ATST), with a clip-level version (named ATST-Clip) and a frame-level version (named ATST-Frame), responsible for learning clip-level and frame-level representations, respectively.

Audio classification with Dilated Convolution with Learnable Spacings

k-h-ismail/dcls-audio 25 Sep 2023

Dilated convolution with learnable spacings (DCLS) is a recent convolution method in which the positions of the kernel elements are learned throughout training by backpropagation.

Classifying Variable-Length Audio Files with All-Convolutional Networks and Masked Global Pooling

numpde/phonepad 11 Jul 2016

We trained a deep all-convolutional neural network with masked global pooling to perform single-label classification for acoustic scene classification and multi-label classification for domestic audio tagging in the DCASE-2016 contest.
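Masked global pooling lets a convolutional network handle variable-length clips: clips in a batch are padded to a common length, and the pooling step averages only over the valid frames. A minimal numpy sketch of this idea (array shapes and naming are assumptions for illustration):

```python
import numpy as np

def masked_global_avg_pool(features, lengths):
    """Average-pool over time while ignoring padded frames.

    features: (batch, time, channels) array, zero-padded to a common
              time length.
    lengths:  true number of valid frames for each clip in the batch.
    Illustrative sketch of masked global pooling for variable-length input.
    """
    features = np.asarray(features, dtype=float)
    _, time, _ = features.shape
    # mask[b, t] is True only where frame t is a real (non-padded) frame.
    mask = np.arange(time)[None, :] < np.asarray(lengths)[:, None]
    masked = features * mask[:, :, None]
    # Divide by the true length, not the padded length.
    return masked.sum(axis=1) / np.asarray(lengths, dtype=float)[:, None]
```

Without the mask, padded zeros would dilute the average and make the pooled representation depend on how much padding a clip happened to receive.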

Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data

iooops/CS221-Audio-Tagging 6 Aug 2018

To use the order information of sound events, we propose sequential labelled data (SLD), where both the presence or absence and the order information of sound events are known.

Guided learning for weakly-labeled semi-supervised sound event detection

Kikyo-16/Sound_event_detection 6 Jun 2019

Instead of designing a single model by considering a trade-off between the two sub-targets, we design a teacher model aiming at audio tagging to guide a student model aiming at boundary detection to learn using the unlabeled data.

Evaluation of post-processing algorithms for polyphonic sound event detection

topel/dcase19-RCNN-task4 17 Jun 2019

We compared post-processing algorithms on the temporal prediction curves of two models: one based on the challenge's baseline and a Multiple Instance Learning (MIL) model.
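A common family of such post-processing algorithms smooths the frame-wise probability curve before binarizing it, so that spurious single-frame spikes do not become events. A generic sketch using a median filter followed by a fixed threshold (window size and threshold are assumed values; the paper compares several algorithms of this kind):

```python
import numpy as np

def smooth_and_threshold(frame_probs, win=5, threshold=0.5):
    """Median-filter a frame-wise probability curve, then binarize.

    Smoothing removes isolated spikes in the temporal prediction curve
    before the threshold turns it into event-active / event-inactive
    frames. Generic sketch, not a specific paper's configuration.
    """
    p = np.asarray(frame_probs, dtype=float)
    pad = win // 2
    padded = np.pad(p, pad, mode="edge")  # extend edges so every frame has a full window
    smoothed = np.array([np.median(padded[i:i + win]) for i in range(len(p))])
    return (smoothed >= threshold).astype(int)
```

The window size trades off robustness against temporal resolution: larger windows suppress more noise but also erode short events.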

DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events

Jungjee/DcaseNet 21 Sep 2020

Single-task deep neural networks, each performing one target task among the diverse cross-related tasks in the acoustic scene and event literature, are being developed.