Audio Tagging

39 papers with code • 1 benchmark • 7 datasets

Audio tagging is the task of predicting the tags that describe an audio clip. It covers music tagging, acoustic scene classification, and audio event classification, among other tasks.
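As a rough illustration of what "predicting tags" can mean in practice, the sketch below assumes a multi-label setup in which some model has already produced one raw score (logit) per tag for a clip. The function names, the tag names, the logit values, and the 0.5 threshold are all hypothetical and chosen for illustration; none of them come from the papers listed here.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return, in alphabetical order, every tag whose probability reaches the threshold.

    Each tag is decided independently, so a clip can receive zero, one,
    or several tags. This is what distinguishes multi-label tagging from
    choosing a single class with a softmax.
    """
    return sorted(tag for tag, z in logits.items() if sigmoid(z) >= threshold)


# Hypothetical per-tag logits for one clip (made-up numbers, not model output).
clip_logits = {"speech": 2.1, "music": -0.3, "dog_bark": 0.8, "siren": -4.0}
tags = predict_tags(clip_logits)
print(tags)
```

Lowering the threshold trades precision for recall: with `threshold=0.4` the same clip would also pick up the `music` tag.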

Most implemented papers

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

qiuqiangkong/audioset_tagging_cnn 23 Aug 2020

We transfer PANNs to six audio pattern recognition tasks, and demonstrate state-of-the-art performance in several of those tasks.

Speech Denoising with Deep Feature Losses

anicolson/DeepXi 27 Jun 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

musicnn: Pre-trained convolutional neural networks for music audio tagging

jordipons/musicnn 14 Sep 2019

Pronounced as "musician", the musicnn library contains a set of pre-trained musically motivated convolutional neural networks for music audio tagging: https://github.com/jordipons/musicnn.

General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

iooops/CS221-Audio-Tagging 26 Jul 2018

The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology.

AST: Audio Spectrogram Transformer

YuanGongND/ast 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

yongxuUSTC/aDAE_DNN_audio_tagging 13 Jul 2016

For the unsupervised feature learning, we propose to use a symmetric or asymmetric deep de-noising auto-encoder (sDAE or aDAE) to generate new data-driven features from the Mel-Filter Banks (MFBs) features.

Convolutional Gated Recurrent Neural Network Incorporating Spatial Features for Audio Tagging

yongxuUSTC/cnn_rnn_spatial_audio_tagging 24 Feb 2017

In this paper, we propose to use a convolutional neural network (CNN) to extract robust features from mel-filter banks (MFBs), spectrograms or even raw waveforms for audio tagging.

Speech Denoising Convolutional Neural Network trained with Deep Feature Losses.

francoisgermain/SpeechDenoisingWithDeepFeatureLosses Interspeech 2018

We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.

General audio tagging with ensembling convolutional neural network and statistical features

Cocoxili/DCASE2018Task2 30 Oct 2018

Audio tagging is challenging due to the limited size of data and noisy labels.

Audio tagging with noisy labels and minimal supervision

lRomul/argus-freesound 7 Jun 2019

The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.