Audio Classification

131 papers with code • 23 benchmarks • 34 datasets

Audio Classification is a machine learning task that involves assigning audio signals to classes or categories. The goal is to enable machines to automatically recognize and distinguish between different types of audio, such as music, speech, and environmental sounds.

Most implemented papers

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

MrtnMndt/OCDVAE_ContinualLearning 28 May 2019

Modern deep neural networks are well known to be brittle in the face of unknown data instances and recognition of the latter remains a challenge.

Rethinking CNN Models for Audio Classification

kamalesh0406/Audio-Classification 22 Jul 2020

We show that, even though we use the pretrained model weights for initialization, performance varies across runs of the same model.

AST: Audio Spectrogram Transformer

YuanGongND/ast 5 Apr 2021

In the past decade, convolutional neural networks (CNNs) have been widely adopted as the main building block for end-to-end audio classification models, which aim to learn a direct mapping from audio spectrograms to corresponding labels.
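The spectrogram input mentioned above can be illustrated with a minimal sketch. This is not the AST or any CNN model; it only shows one common way to turn a waveform into a log-magnitude spectrogram that such models could consume. The function name `log_spectrogram`, the frame length, the hop size, and the synthetic 440 Hz tone are all assumptions chosen for illustration.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Return a (frames, frame_len // 2 + 1) log-magnitude spectrogram.

    frame_len and hop are illustrative defaults, not values from any paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude of the real FFT of each frame gives one spectrogram row.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # A small constant avoids log(0) in silent bins.
    return np.log(magnitude + 1e-8)

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)
# Frequency bin with the highest average energy, converted to Hz.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 256
```

A classifier would then map the 2-D array `spec` (time by frequency) to a label; the choice of model is outside the scope of this sketch.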

AudioMNIST: Exploring Explainable Artificial Intelligence for Audio Analysis on a Simple Benchmark

soerenab/AudioMNIST 9 Jul 2018

Explainable Artificial Intelligence (XAI) is targeted at understanding how models perform feature selection and derive their classification decisions.

Specifying Weight Priors in Bayesian Deep Neural Networks with Empirical Bayes

IntelLabs/bayesian-torch 12 Jun 2019

We propose the MOdel Priors with Empirical Bayes using DNN (MOPED) method to choose informed weight priors in Bayesian neural networks.

Π-nets: Deep Polynomial Neural Networks

grigorisg9gr/polynomial_nets 8 Mar 2020

Deep Convolutional Neural Networks (DCNNs) are currently the method of choice for both generative and discriminative learning in computer vision and machine learning.

Generalised Interpretable Shapelets for Irregular Time Series

patrick-kidger/generalised_shapelets 28 May 2020

The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of 'shapelets'.
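As a rough illustration of the idea, the sketch below implements the classical shapelet transform, assuming similarity is measured as the minimum Euclidean distance between a shapelet and any same-length window of the series. That distance choice, the function names, and the toy data are assumptions for illustration; this is not the generalised method the paper proposes.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any window of the series."""
    m = len(shapelet)
    # Every contiguous window of the series with the same length as the shapelet.
    windows = np.stack([series[i : i + m] for i in range(len(series) - m + 1)])
    distances = np.sqrt(((windows - shapelet) ** 2).sum(axis=1))
    return float(distances.min())

def shapelet_transform(series, shapelets):
    """Describe a series by its distance to each shapelet (one feature per shapelet)."""
    return np.array([shapelet_distance(series, s) for s in shapelets])

# Hypothetical toy data: a series containing a single bump.
series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
shapelets = [
    np.array([1.0, 2.0, 1.0]),  # matches the bump exactly
    np.array([3.0, 3.0, 3.0]),  # a plateau that never occurs in the series
]

features = shapelet_transform(series, shapelets)
```

Here `features` has one entry per shapelet: a distance of zero for the bump, which occurs in the series, and a larger distance for the plateau, which does not. A downstream classifier would operate on such feature vectors.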

CRNNs for Urban Sound Tagging with spatiotemporal context

multitel-ai/urban-sound-tagging 24 Aug 2020

This paper describes the CRNNs we used to participate in Task 5 of the DCASE 2020 challenge.

Artificially Synthesising Data for Audio Classification and Segmentation to Improve Speech and Music Detection in Radio Broadcast

satvik-venkatesh/audio-seg-data-synth 19 Feb 2021

Speech and music detection is useful as a pre-processing step to index, store, and modify audio recordings, radio broadcasts, and TV programmes.

Slow-Fast Auditory Streams For Audio Recognition

ekazakos/auditory-slow-fast 5 Mar 2021

We propose a two-stream convolutional network for audio recognition that operates on time-frequency spectrogram inputs.