Audio Tagging

41 papers with code • 1 benchmark • 8 datasets

Audio tagging is the task of predicting the tags that describe an audio clip. It covers music tagging, acoustic scene classification, audio event classification, and related tasks.
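
As a rough illustration of what "predicting the tags" can mean in practice, the sketch below treats tagging as a multi-label decision: each tag gets an independent score, and every tag whose score reaches a threshold is kept. The tag names, logit values, the `predict_tags` helper, and the 0.5 threshold are all hypothetical and are not taken from any paper listed on this page.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw model output (logit) to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every tag whose sigmoid score reaches the threshold, highest score first."""
    scores = {tag: sigmoid(z) for tag, z in logits.items()}
    kept = [tag for tag, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda tag: scores[tag], reverse=True)


# Hypothetical logits for a single clip; a real system would get these from a model.
clip_logits = {"speech": 2.0, "music": -1.5, "dog_bark": 0.3, "siren": -3.0}
tags = predict_tags(clip_logits)
print(tags)
```

Because each tag is scored independently, a clip can receive several tags or none at all, which is what separates tagging from single-label classification.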

Benchmarks

These leaderboards are used to track progress in Audio Tagging.

Dataset	Best Model
AudioSet	CAV-MAE (Audio-Visual)

Libraries

Use these libraries to find Audio Tagging models and implementations.

fschmid56/efficientat

2 papers

188

Datasets

Most implemented papers

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

qiuqiangkong/audioset_tagging_cnn • 23 Aug 2020

We transfer PANNs to six audio pattern recognition tasks, and demonstrate state-of-the-art performance in several of those tasks.

Speech Denoising with Deep Feature Losses

anicolson/DeepXi • 27 Jun 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

musicnn: Pre-trained convolutional neural networks for music audio tagging

jordipons/musicnn • 14 Sep 2019

Pronounced as "musician", the musicnn library contains a set of pre-trained musically motivated convolutional neural networks for music audio tagging: https://github.com/jordipons/musicnn.

AST: Audio Spectrogram Transformer

YuanGongND/ast • 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.

General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

iooops/CS221-Audio-Tagging • 26 Jul 2018

The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology.

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

yongxuUSTC/aDAE_DNN_audio_tagging • 13 Jul 2016

For the unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from the Mel-Filter Banks (MFBs) features.

Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging

yongxuUSTC/cnn_rnn_spatial_audio_tagging • 24 Feb 2017

In this paper, we propose to use a convolutional neural network (CNN) to extract robust features from mel-filter banks (MFBs), spectrograms or even raw waveforms for audio tagging.

Speech Denoising Convolutional Neural Network trained with Deep Feature Losses.

francoisgermain/SpeechDenoisingWithDeepFeatureLosses • Interspeech 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

General audio tagging with ensembling convolutional neural network and statistical features

Cocoxili/DCASE2018Task2 • 30 Oct 2018

Audio tagging is challenging due to the limited size of data and noisy labels.

Audio tagging with noisy labels and minimal supervision

lRomul/argus-freesound • 7 Jun 2019

The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.
