Audio Classification

85 papers with code • 17 benchmarks • 24 datasets

Audio classification, or audio tagging, is the task of predicting the tags of an audio clip.



Most implemented papers

CNN Architectures for Large-Scale Audio Classification

towhee-io/towhee 29 Sep 2016

Convolutional Neural Networks (CNNs) have proven very effective in image classification and show promise for audio.

Perceiver: General Perception with Iterative Attention

deepmind/deepmind-research 4 Mar 2021

The perception models used in deep learning, on the other hand, are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models.

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition

qiuqiangkong/audioset_tagging_cnn 23 Aug 2020

We transfer PANNs to six audio pattern recognition tasks, and demonstrate state-of-the-art performance in several of those tasks.

Multi-level Attention Model for Weakly Supervised Audio Classification

IBM/MAX-Audio-Classifier 6 Mar 2018

The objective of audio classification is to predict the presence or absence of audio events in an audio clip.
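Predicting presence or absence per event can be read as multi-label tagging: each event gets its own independent yes/no decision. The sketch below is only an illustration of that reading, not the paper's multi-level attention model. The function name `tag_clip`, the label names, the logit values, and the 0.5 threshold are all assumptions made up for the example.

```python
import math
from typing import Dict


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tag_clip(logits: Dict[str, float], threshold: float = 0.5) -> Dict[str, bool]:
    """Decide presence/absence for each event independently.

    `logits` holds one hypothetical model score per event label.
    An event counts as present when its probability reaches `threshold`.
    """
    return {label: sigmoid(score) >= threshold for label, score in logits.items()}


# Hypothetical scores for one clip; a real model would produce these.
example_logits = {"speech": 2.1, "dog_bark": -1.3, "music": 0.4}
tags = tag_clip(example_logits)
```

Because each label is thresholded on its own, a clip can carry several tags at once or none at all, which is what separates tagging from single-label classification.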

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

clovaai/AdamP ICLR 2021

Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers.

Convolutional RNN: an Enhanced Model for Extracting Features from Sequential Data

cruvadom/Convolutional-RNN 18 Feb 2016

Traditional convolutional layers extract features from patches of data by applying a non-linearity on an affine function of the input.
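The description of a traditional convolutional layer above can be sketched in a few lines: slide a window over the input, compute an affine function (weighted sum plus bias) of each patch, and pass the result through a non-linearity. This is a minimal 1-D, single-filter illustration, not code from the paper; the choice of ReLU as the non-linearity and all names and values are assumptions.

```python
from typing import List


def relu(x: float) -> float:
    """A common non-linearity: clamp negative values to zero."""
    return max(0.0, x)


def conv1d(signal: List[float], weights: List[float], bias: float) -> List[float]:
    """Apply one convolutional filter to a 1-D signal (no padding, stride 1)."""
    k = len(weights)
    features = []
    for start in range(len(signal) - k + 1):
        patch = signal[start:start + k]
        # Affine function of the patch: dot product with the weights plus a bias.
        affine = sum(w * x for w, x in zip(weights, patch)) + bias
        # Non-linearity applied to the affine output.
        features.append(relu(affine))
    return features


# Hypothetical input and filter values, chosen only to make the steps visible.
features = conv1d([0.0, 1.0, 3.0, 2.0], weights=[-1.0, 1.0], bias=0.5)
# features == [1.5, 2.5, 0.0]
```

The last patch `[3.0, 2.0]` gives an affine value of -0.5, which ReLU clamps to 0.0.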

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

MrtnMndt/OCDVAE_ContinualLearning 28 May 2019

Modern deep neural networks are well known to be brittle in the face of unknown data instances and recognition of the latter remains a challenge.

Rethinking CNN Models for Audio Classification

kamalesh0406/Audio-Classification 22 Jul 2020

Besides, we show that even though we use the pretrained model weights for initialization, there is variance in performance in various output runs of the same model.

LEAF: A Learnable Frontend for Audio Classification

google-research/leaf-audio 21 Jan 2021

In this work we show that we can train a single learnable frontend that outperforms mel-filterbanks on a wide range of audio signals, including speech, music, audio events and animal sounds, providing a general-purpose learned frontend for audio classification.
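For context on the baseline named above, mel-filterbanks are fixed (not learned) filters whose centre frequencies are spaced evenly on the mel scale. The sketch below computes such band edges using one widely used form of the mel formula, m = 2595 log10(1 + f/700). It illustrates the fixed baseline only and is not LEAF itself; the function names, the 0 to 8000 Hz range, and the band count are assumptions for the example.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mels (one common variant of the formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> List[float]:
    """Return n_bands + 2 edge frequencies in Hz, evenly spaced in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_bands + 2)]


# Hypothetical settings: 4 bands covering 0 to 8000 Hz.
edges = mel_band_edges(0.0, 8000.0, n_bands=4)
```

Equal steps in mels translate to progressively wider bands in Hz, so low frequencies get finer resolution than high ones. A learnable frontend replaces this hand-chosen spacing with parameters fitted to the data.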

Interpreting and Explaining Deep Neural Networks for Classification of Audio Signals

soerenab/AudioMNIST 9 Jul 2018

Interpretability of deep neural networks is a recently emerging area of machine learning research targeting a better understanding of how models perform feature selection and derive their classification decisions.