Audio Tagging

42 papers with code • 1 benchmark • 9 datasets

Audio tagging is the task of predicting the tags of an audio clip. It covers music tagging, acoustic scene classification, and audio event classification, among others.
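When a clip may carry several tags at once, a common way to turn model scores into tags is to apply a sigmoid to each per-tag score and keep the tags that clear a threshold. The sketch below illustrates only that decision step. The tag vocabulary, the scores, the 0.5 threshold, and the helper name `predict_tags` are all hypothetical and are not taken from any of the papers listed here.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits, tag_names, threshold=0.5):
    """Return (tag, probability) pairs whose probability meets the threshold.

    Each tag is decided independently, so a clip can receive zero,
    one, or several tags.
    """
    if len(logits) != len(tag_names):
        raise ValueError("expected one logit per tag")
    probs = [sigmoid(z) for z in logits]
    return [(name, p) for name, p in zip(tag_names, probs) if p >= threshold]


# Hypothetical vocabulary and hypothetical model outputs for one clip.
TAGS = ["speech", "music", "dog_bark", "rain"]
clip_logits = [2.0, -1.0, 0.5, -3.0]

predicted = predict_tags(clip_logits, TAGS)
print([name for name, _ in predicted])
```

For single-label settings such as acoustic scene classification, the usual alternative is to take the highest-scoring class instead of thresholding each tag separately.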

Most implemented papers

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

qiuqiangkong/audioset_tagging_cnn 23 Aug 2020

We transfer PANNs to six audio pattern recognition tasks, and demonstrate state-of-the-art performance in several of those tasks.

Speech Denoising with Deep Feature Losses

anicolson/DeepXi 27 Jun 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

musicnn: Pre-trained convolutional neural networks for music audio tagging

jordipons/musicnn 14 Sep 2019

Pronounced as "musician", the musicnn library contains a set of pre-trained musically motivated convolutional neural networks for music audio tagging: https://github.com/jordipons/musicnn.

AST: Audio Spectrogram Transformer

YuanGongND/ast 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.

General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

iooops/CS221-Audio-Tagging 26 Jul 2018

The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology.

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

yongxuUSTC/aDAE_DNN_audio_tagging 13 Jul 2016

For the unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from the Mel-Filter Banks (MFBs) features.

Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging

yongxuUSTC/cnn_rnn_spatial_audio_tagging 24 Feb 2017

In this paper, we propose to use a convolutional neural network (CNN) to extract robust features from mel-filter banks (MFBs), spectrograms or even raw waveforms for audio tagging.

Speech Denoising Convolutional Neural Network trained with Deep Feature Losses.

francoisgermain/SpeechDenoisingWithDeepFeatureLosses Interspeech 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

General audio tagging with ensembling convolutional neural network and statistical features

Cocoxili/DCASE2018Task2 30 Oct 2018

Audio tagging is challenging due to the limited size of data and noisy labels.

Audio tagging with noisy labels and minimal supervision

lRomul/argus-freesound 7 Jun 2019

The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.