Audio Tagging

41 papers with code • 1 benchmark • 8 datasets

Audio tagging is the task of predicting the tags of an audio clip. It covers music tagging, acoustic scene classification, audio event classification, and related tasks.
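As a rough illustration of the definition above, tagging is often framed as multi-label classification: a model emits one score per tag, and every tag whose score clears a threshold is assigned to the clip. The sketch below assumes that framing; the tag names, logit values, the 0.5 threshold, and the `predict_tags` helper are all hypothetical and not taken from any paper listed here.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return, in sorted order, every tag whose sigmoid score reaches the threshold.

    Each tag is decided independently, so a clip can receive zero, one,
    or several tags (multi-label, not single-label, prediction).
    """
    return sorted(tag for tag, z in logits.items() if sigmoid(z) >= threshold)


# Hypothetical per-tag logits for a single clip (illustrative values only).
clip_logits = {"speech": 2.1, "music": 0.3, "dog_bark": -1.7, "traffic": -0.2}

tags = predict_tags(clip_logits)
print(tags)
```

Raising the threshold trades recall for precision: with `threshold=0.8`, only the most confident tag in this toy example survives.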

Latest papers with no code

Multi-encoder attention-based architectures for sound recognition with partial visual assistance

no code yet • 26 Sep 2022

Large-scale sound recognition data sets typically consist of acoustic recordings obtained from multimedia libraries.

Impact of temporal resolution on convolutional recurrent networks for audio tagging and sound event detection

no code yet • 26 Sep 2022

Many state-of-the-art systems for audio tagging and sound event detection employ convolutional recurrent neural architectures.

Improved Zero-Shot Audio Tagging & Classification with Patchout Spectrogram Transformers

no code yet • 24 Aug 2022

Standard machine learning models for tagging and classifying acoustic signals cannot handle classes that were not seen during training.

Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer

no code yet • ACL ARR January 2022

Our key idea is to share the image modality between bi-modal image-text representations and bi-modal image-audio representations; the image modality functions as a pivot and connects audio and text in a tri-modal embedding space implicitly. In a difficult zero-shot setting with no paired audio-text data, our model demonstrates state-of-the-art zero-shot performance on the ESC50 and US8K audio classification tasks, and even surpasses the supervised state of the art for Clotho caption retrieval (with audio queries) by 2.2% R@1.

Audiovisual transfer learning for audio tagging and sound event detection

no code yet • 9 Jun 2021

We study the merit of transfer learning for two sound recognition problems, i.e., audio tagging and sound event detection.

ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition

no code yet • 3 Jun 2021

For the RAVDESS dataset, our system is 3.3x smaller than the previous best system.

What is the ground truth? Reliability of multi-annotator data for audio tagging

no code yet • 9 Apr 2021

Crowdsourcing has become a common approach for annotating large amounts of data.

Joint framework with deep feature distillation and adaptive focal loss for weakly supervised audio tagging and acoustic event detection

no code yet • 23 Mar 2021

A good joint training framework helps improve the performance of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously.

Enhancing Audio Augmentation Methods with Consistency Learning

no code yet • 9 Feb 2021

For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss.

Audio Tagging by Cross Filtering Noisy Labels

no code yet • 16 Jul 2020

Yet, it is labor-intensive to accurately annotate large amounts of audio data, and the dataset may contain noisy labels in practical settings.