Audio Tagging
41 papers with code • 1 benchmark • 8 datasets
Audio tagging is the task of predicting the tags of audio clips. Audio tagging tasks include music tagging, acoustic scene classification, audio event classification, and more.
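Because a clip may carry several tags at once, audio tagging is usually framed as multi-label classification: each tag gets an independent sigmoid score rather than a softmax over classes. A minimal sketch of that decoding step, assuming a hypothetical model that emits one logit per tag (the function and tag names are illustrative):

```python
import numpy as np

def predict_tags(logits, tag_names, threshold=0.5):
    """Turn per-tag clip-level logits into a list of predicted tags.

    Each tag is scored independently with a sigmoid, so any number of
    tags (including zero) can be active for a single clip.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [name for name, p in zip(tag_names, probs) if p >= threshold]

# Hypothetical logits from some clip-level model.
tags = predict_tags([2.0, -1.5, 0.7], ["speech", "music", "dog_bark"])
# Both "speech" and "dog_bark" clear the 0.5 threshold.
```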
Libraries
Use these libraries to find Audio Tagging models and implementations.

Most implemented papers
CRNNs for Urban Sound Tagging with spatiotemporal context
This paper describes CRNNs we used to participate in Task 5 of the DCASE 2020 challenge.
Efficient Training of Audio Transformers with Patchout
However, one of the main shortcomings of transformer models, compared to the well-established CNNs, is their computational complexity.
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
We provide models of different complexity levels, scaling from low-complexity models up to a new state-of-the-art performance of .483 mAP on AudioSet.
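mAP (mean average precision) is the standard AudioSet metric: average precision is computed per tag from the ranking of clip scores, then averaged across tags. A small sketch of that computation, assuming score and label matrices shaped (clips, tags); the function names are illustrative:

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one tag: mean precision at each positive, ranked by score."""
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]
    hits = np.cumsum(labels)
    precisions = hits / (np.arange(len(labels)) + 1)
    return float(precisions[labels == 1].mean())

def mean_average_precision(score_matrix, label_matrix):
    """Macro-average AP over tags (AudioSet-style mAP)."""
    aps = [average_precision(s, l)
           for s, l in zip(np.asarray(score_matrix).T,
                           np.asarray(label_matrix).T)]
    return float(np.mean(aps))
```

A perfect ranking for every tag yields a mAP of 1.0; the .483 figure quoted above is on this 0-to-1 scale.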
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks
In order to tackle both clip-level and frame-level tasks, this paper proposes Audio Teacher-Student Transformer (ATST), with a clip-level version (named ATST-Clip) and a frame-level version (named ATST-Frame), responsible for learning clip-level and frame-level representations, respectively.
Audio classification with Dilated Convolution with Learnable Spacings
Dilated convolution with learnable spacings (DCLS) is a recent convolution method in which the positions of the kernel elements are learned throughout training by backpropagation.
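The core construction in DCLS is placing kernel weights at fractional, learnable positions: each weight is split between its two nearest integer taps by linear interpolation, which keeps the dense kernel differentiable with respect to the positions. A minimal 1D sketch of that kernel construction (no autograd or training loop; the function name is illustrative):

```python
import numpy as np

def dcls_kernel_1d(weights, positions, kernel_size):
    """Build a dense 1D kernel from weights at fractional positions.

    Each weight w at position p is distributed between taps floor(p)
    and floor(p)+1 in proportion to its distance from each, so the
    resulting kernel varies smoothly as the positions move.
    """
    kernel = np.zeros(kernel_size)
    for w, p in zip(weights, positions):
        lo = int(np.floor(p))
        frac = p - lo
        kernel[lo] += w * (1.0 - frac)
        if lo + 1 < kernel_size:
            kernel[lo + 1] += w * frac
    return kernel

# One weight at position 1.25 spreads 75%/25% over taps 1 and 2.
k = dcls_kernel_1d([1.0], [1.25], kernel_size=4)
```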
Classifying Variable-Length Audio Files with All-Convolutional Networks and Masked Global Pooling
We trained a deep all-convolutional neural network with masked global pooling to perform single-label classification for acoustic scene classification and multi-label classification for domestic audio tagging in the DCASE-2016 contest.
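Masked global pooling is what lets a fixed network handle variable-length inputs: clips are zero-padded to a common length, and the pooling averages only over each clip's valid frames. A minimal numpy sketch, assuming features already padded to shape (batch, time, channels); the function name is illustrative:

```python
import numpy as np

def masked_global_mean_pool(features, lengths):
    """Mean-pool features over time, ignoring zero-padded frames.

    features: (batch, time, channels) array, zero-padded on the time axis.
    lengths:  true number of valid frames for each clip in the batch.
    """
    batch, time, _ = features.shape
    # mask[b, t] is 1.0 for valid frames of clip b, 0.0 for padding.
    mask = (np.arange(time)[None, :] < np.asarray(lengths)[:, None]).astype(float)
    summed = (features * mask[:, :, None]).sum(axis=1)
    # Divide by the true length, not the padded length.
    return summed / np.asarray(lengths, dtype=float)[:, None]
```

Without the mask, a plain mean over the padded axis would dilute short clips toward zero, biasing predictions by clip length.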
Audio Tagging With Connectionist Temporal Classification Model Using Sequential Labelled Data
To use the order information of sound events, we propose sequential labelled data (SLD), where both the presence or absence and the order information of sound events are known.
Guided learning for weakly-labeled semi-supervised sound event detection
Instead of designing a single model by considering a trade-off between the two sub-targets, we design a teacher model aiming at audio tagging to guide a student model aiming at boundary detection to learn using the unlabeled data.
Evaluation of post-processing algorithms for polyphonic sound event detection
We compared post-processing algorithms on the temporal prediction curves of two models: one based on the challenge's baseline and a Multiple Instance Learning (MIL) model.
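A typical post-processing step of this kind smooths each per-frame probability curve with a median filter and then thresholds it into event-active frames, suppressing isolated spurious spikes. A minimal sketch, with illustrative function and parameter names:

```python
import numpy as np

def median_filter_and_threshold(curve, win=5, threshold=0.5):
    """Smooth a per-frame probability curve with a median filter,
    then binarize it into event-active frames."""
    pad = win // 2
    # Edge-pad so the output keeps the same number of frames.
    padded = np.pad(np.asarray(curve, dtype=float), pad, mode="edge")
    smoothed = np.array([np.median(padded[i:i + win])
                         for i in range(len(curve))])
    return (smoothed >= threshold).astype(int)

# A one-frame spike is removed, while sustained activity survives.
spiky = median_filter_and_threshold([0.0, 0.0, 1.0, 0.0, 0.0], win=3)
```

The filter window and threshold are the knobs such comparisons tune; larger windows remove more spikes but also erode short genuine events.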
DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events
In the acoustic scene and event literature, single-task deep neural networks are being developed, each performing one target task among diverse cross-related tasks.