Audio Classification

132 papers with code • 20 benchmarks • 35 datasets

Audio Classification is a machine learning task in which audio signals are assigned to classes or categories. The goal is to enable machines to automatically recognize and distinguish between different types of audio, such as music, speech, and environmental sounds.
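
As a rough illustration of the task, the sketch below labels short clips as "low" or "high" pitched. Everything in it is an assumption made for the example rather than something taken from the papers listed here: the synthetic sine tones stand in for real recordings, the single feature is a power-weighted spectral centroid, and the classifier is a simple nearest-centroid rule. The helper names (`make_tone`, `spectral_centroid`, `fit_centroids`, `predict`) are hypothetical.

```python
import numpy as np

SAMPLE_RATE = 8000  # Hz; assumed value for this toy example


def make_tone(freq_hz, duration_s=0.25, noise=0.05, seed=0):
    """Synthesize a noisy sine tone as a stand-in for a real recording."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(SAMPLE_RATE * duration_s)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t) + noise * rng.standard_normal(t.size)


def spectral_centroid(signal):
    """Power-weighted mean frequency of the signal, in Hz."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / SAMPLE_RATE)
    return float(np.sum(freqs * power) / np.sum(power))


def fit_centroids(signals, labels):
    """Store the mean feature value of each class."""
    by_class = {}
    for signal, label in zip(signals, labels):
        by_class.setdefault(label, []).append(spectral_centroid(signal))
    return {label: sum(v) / len(v) for label, v in by_class.items()}


def predict(signal, centroids):
    """Return the class whose mean feature is closest to the clip's feature."""
    feature = spectral_centroid(signal)
    return min(centroids, key=lambda label: abs(centroids[label] - feature))


# Toy training set: three low-pitched and three high-pitched tones.
train_freqs = [200, 300, 400, 1600, 2000, 2400]
train_labels = ["low", "low", "low", "high", "high", "high"]
train_signals = [make_tone(f, seed=i) for i, f in enumerate(train_freqs)]
centroids = fit_centroids(train_signals, train_labels)

# Classify two unseen clips.
print(predict(make_tone(350, seed=10), centroids))
print(predict(make_tone(1900, seed=11), centroids))
```

Real systems typically replace the hand-picked feature with learned representations (for example, spectrogram-based neural networks), but the overall shape is the same: extract features from the waveform, then map them to a label.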

Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

alibaba-damo-academy/FunASR 14 Nov 2023

Recently, instruction-following audio-language models have received broad attention for audio interaction with humans.

3,370

Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance

kaen2891/adversarial_fine-tuning_using_generated_respiratory_sound 11 Nov 2023

In this work, we propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder.

13

Auto deep learning for bioacoustic signals

giuliotosato/autokeras-bioacustic 8 Nov 2023

This study investigates the potential of automated deep learning to enhance the accuracy and efficiency of multi-class classification of bird vocalizations, compared against traditional manually-designed deep learning models.

4

Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models

fschmid56/efficientat 24 Oct 2023

Audio Spectrogram Transformers are excellent at exploiting large datasets, creating powerful pre-trained models that surpass CNNs when fine-tuned on downstream tasks.

183

CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition

knoriy/CLARA 18 Oct 2023

Using a large multilingual audio corpus and self-supervised learning, CLARA develops speech representations enriched with emotions, advancing emotion-aware multilingual speech processing.

59

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

PKU-YuanGroup/Video-LLaVA 3 Oct 2023

We thus propose VIDAL-10M, a dataset of Video, Infrared, Depth, and Audio data with their corresponding Language.

2,413

Audio classification with Dilated Convolution with Learnable Spacings

k-h-ismail/dilated-convolution-with-learnable-spacings-pytorch 25 Sep 2023

Dilated convolution with learnable spacings (DCLS) is a recent convolution method in which the positions of the kernel elements are learned throughout training by backpropagation.

50

EDAC: Efficient Deployment of Audio Classification Models For COVID-19 Detection

edac-ml4h/edac-ml4h 11 Sep 2023

Various researchers made use of machine learning methods in an attempt to detect COVID-19.

0

AudRandAug: Random Image Augmentations for Audio Classification

turab45/audrandaug 9 Sep 2023

To address this gap, we introduce AudRandAug, an adaptation of RandAug for audio data.

1

Global birdsong embeddings enable superior transfer learning for bioacoustic classification

facebookresearch/audiomae 12 Jul 2023

With the advent of deep learning models, classification of important signals from these datasets has markedly improved.

475