Audio Classification

131 papers with code • 23 benchmarks • 34 datasets

Audio Classification is a machine learning task that involves assigning audio signals to classes or categories. The goal is to enable machines to automatically recognize and distinguish between different types of audio, such as music, speech, and environmental sounds.
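
As a rough illustration of the task, the sketch below labels short synthetic clips as "low" or "high" tones. Everything in it is an assumption made for the example: the sample rate, the two hand-picked features (zero-crossing rate and RMS energy), the nearest-centroid rule, and all function names. The models on the leaderboards below learn their features from data instead.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example


def make_tone(freq_hz, seconds=0.25, noise=0.01, rng=None):
    """Synthesize a slightly noisy sine tone as a list of float samples."""
    rng = rng or random.Random(0)
    n = int(SAMPLE_RATE * seconds)
    return [
        math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) + rng.gauss(0, noise)
        for t in range(n)
    ]


def features(clip):
    """Return (zero-crossing rate, RMS energy) for one clip."""
    crossings = sum(1 for a, b in zip(clip, clip[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(clip) - 1)
    rms = math.sqrt(sum(x * x for x in clip) / len(clip))
    return (zcr, rms)


def fit_centroids(labelled_clips):
    """Average the feature vectors of each class (hypothetical helper)."""
    sums, counts = {}, {}
    for label, clip in labelled_clips:
        feats = features(clip)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(clip, centroids):
    """Return the label whose centroid is closest in feature space."""
    feats = features(clip)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


rng = random.Random(0)
train = [("low", make_tone(f, rng=rng)) for f in (200, 250, 300)]
train += [("high", make_tone(f, rng=rng)) for f in (1500, 1600, 1700)]
centroids = fit_centroids(train)
print(classify(make_tone(220, rng=rng), centroids))
```

Zero-crossing rate separates these two classes because it grows with the tone's frequency; real recordings of music, speech, or environmental sounds generally need richer representations such as spectrograms.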

Benchmarks

These leaderboards track progress in Audio Classification.

Dataset	Best Model
AudioSet	OmniVec
ESC-50	InternVideo2
VGGSound	Mirasol3B
ICBHI Respiratory Sound Database	AST (Patch-Mix CL)
SHD	SNN with Dilated Convolution with Learnable Spacings
FSD50K	ONE-PEACE
Speech Commands	AST-S
DCASE	CrissCross (AudioSet)
Balanced Audio Set	BEATs
EPIC-KITCHENS-100	Audiovisual Masked Autoencoder (Audiovisual, Single)
SSC	SNN with Dilated Convolution with Learnable Spacings
BirdCLEF 2021	EfficientLEAF (8s)
DiCOVA	AUCO ResNet
CREMA-D	EfficientLEAF
RAVDESS	ASM-RH-A
VocalSound	VocalSound Baseline
Multimodal PISA	MMDL
UCR Time Series Classification Archive	CDIL
DEEP-VOICE: DeepFake Voice Recognition	XGBoost (330)
EPIC-SOUNDS	Mirasol3B (A+V)

Libraries

Use these libraries to find Audio Classification models and implementations.

Sreyan88/LAPE • 3 papers
towhee-io/towhee • 2 papers • 2,987 stars
google-research/leaf-audio • 2 papers • 473 stars
fschmid56/efficientat • 2 papers • 180 stars



Most implemented papers

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

MrtnMndt/OCDVAE_ContinualLearning • 28 May 2019

Modern deep neural networks are well known to be brittle in the face of unknown data instances, and recognition of the latter remains a challenge.

Rethinking CNN Models for Audio Classification

kamalesh0406/Audio-Classification • 22 Jul 2020

Besides, we show that even though we use the pretrained model weights for initialization, there is variance in performance in various output runs of the same model.

AST: Audio Spectrogram Transformer

YuanGongND/ast • 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.
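
To make the "audio spectrogram" input concrete, here is a minimal sketch that turns a waveform into a log-magnitude spectrogram with a framed discrete Fourier transform. The frame length, hop size, Hann window, and function name are assumptions chosen for the example. It is not the AST front end, which uses log-Mel filterbank features; the Mel filterbank step is omitted here for brevity.

```python
import cmath
import math


def log_spectrogram(samples, frame_len=64, hop=32, eps=1e-10):
    """Return one list of log-magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            bins.append(math.log(abs(coeff) + eps))
        frames.append(bins)
    return frames


# A pure tone that completes 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(512)]
spec = log_spectrogram(tone)
print(len(spec), len(spec[0]))  # number of frames, number of frequency bins
```

The resulting time-by-frequency grid is the kind of two-dimensional input that a CNN or Transformer classifier then maps to labels. The direct DFT is used only to keep the sketch dependency-free; an FFT routine would normally replace it.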

AudioMNIST: Exploring Explainable Artificial Intelligence for Audio Analysis on a Simple Benchmark

soerenab/AudioMNIST • 9 Jul 2018

Explainable Artificial Intelligence (XAI) is targeted at understanding how models perform feature selection and derive their classification decisions.

Specifying Weight Priors in Bayesian Deep Neural Networks with Empirical Bayes

IntelLabs/bayesian-torch • 12 Jun 2019

We propose MOdel Priors with Empirical Bayes using DNN (MOPED) method to choose informed weight priors in Bayesian neural networks.

Π-nets: Deep Polynomial Neural Networks

grigorisg9gr/polynomial_nets • 8 Mar 2020

Deep Convolutional Neural Networks (DCNNs) are currently the method of choice for both generative and discriminative learning in computer vision and machine learning.

Generalised Interpretable Shapelets for Irregular Time Series

patrick-kidger/generalised_shapelets • 28 May 2020

The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of 'shapelets'.
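
A minimal sketch of the classical, discrete shapelet transform may help here: each feature is the smallest Euclidean distance between one shapelet and any same-length window of the series. This is an assumed textbook formulation for regularly sampled series, not the generalised method of the paper above, and the function names are illustrative.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window of equal length."""
    m = len(shapelet)
    return min(
        math.dist(series[i:i + m], shapelet)
        for i in range(len(series) - m + 1)
    )


def shapelet_transform(series, shapelets):
    """Describe a series by its distance to each shapelet (one feature per shapelet)."""
    return [shapelet_distance(series, s) for s in shapelets]


series = [0, 0, 1, 2, 1, 0, 0]
shapelets = [[1, 2, 1], [0, 0, 0]]
print(shapelet_transform(series, shapelets))  # [0.0, 1.0]
```

A distance of 0.0 means the shapelet occurs exactly somewhere in the series. The resulting feature vector can be passed to any standard classifier.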
