Pronounced as "musician", the musicnn library contains a set of pre-trained, musically motivated convolutional neural networks for music audio tagging: https://github.com/jordipons/musicnn.
The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.
We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging.
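To make the decision-level attention pooling concrete, here is a minimal NumPy sketch of MIL-style aggregation: per-segment (instance) tag probabilities are pooled into a clip-level (bag) prediction using attention weights that are softmax-normalized over segments. The segment count, tag count, and random stand-in scores are all hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical MIL "bag": per-segment tag probabilities for one audio clip,
# 10 segments x 3 tags (sizes chosen for illustration only).
segment_probs = rng.uniform(size=(10, 3))

# Decision-level attention: learned scores (random stand-ins here) decide how
# much each segment contributes to each tag's clip-level prediction.
attn_logits = rng.normal(size=(10, 3))
attn = np.exp(attn_logits) / np.exp(attn_logits).sum(axis=0, keepdims=True)

# Attention-weighted pooling over segments; weights sum to 1 per tag,
# so the clip-level output stays a valid probability.
clip_probs = (attn * segment_probs).sum(axis=0)
```

Because the attention weights form a convex combination over segments, the pooled value for each tag remains in [0, 1]; feature-level attention applies the same idea to intermediate embeddings instead of per-segment decisions.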
Audio tagging is challenging due to the limited size of data and noisy labels.
Single-task deep neural networks are being developed that perform one target task among the diverse, cross-related tasks in the acoustic scene and event literature.
For unsupervised feature learning, we propose to use a symmetric or asymmetric deep denoising autoencoder (sDAE or aDAE) to generate new data-driven features from the Mel filter bank (MFB) features.
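As a rough illustration of the symmetric (tied-weight) denoising autoencoder idea, the sketch below trains a single-hidden-layer autoencoder in NumPy to reconstruct clean features from noise-corrupted input; the hidden code serves as the new data-driven features. The feature dimensions, noise level, and learning rate are hypothetical stand-ins, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for Mel filter bank features: 256 frames x 40 bands
# (hypothetical sizes; random data replaces real audio features).
X = rng.normal(size=(256, 40))

n_in, n_hid = X.shape[1], 20
W = rng.normal(scale=0.1, size=(n_in, n_hid))  # tied encoder/decoder weights
b_h = np.zeros(n_hid)
b_o = np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X_in):
    H = sigmoid(X_in @ W + b_h)   # encoder: hidden code = learned features
    X_hat = H @ W.T + b_o         # symmetric decoder reuses W transposed
    return H, X_hat

lr, losses = 0.05, []
for epoch in range(200):
    X_noisy = X + rng.normal(scale=0.3, size=X.shape)  # corrupt the input
    H, X_hat = forward(X_noisy)
    err = X_hat - X                                    # reconstruct the CLEAN target
    losses.append(float(np.mean(err ** 2)))
    dZ = (err @ W) * H * (1.0 - H)                     # backprop through sigmoid encoder
    gW = (X_noisy.T @ dZ + err.T @ H) / len(X)         # tied weights: encoder + decoder terms
    W -= lr * gW
    b_h -= lr * dZ.mean(axis=0)
    b_o -= lr * err.mean(axis=0)

features = forward(X)[0]  # data-driven features for a downstream tagger
```

An asymmetric DAE (aDAE) would instead use separate, possibly differently sized, encoder and decoder weight matrices rather than tying them as above.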
This paper describes the CRNNs we used to participate in Task 5 of the DCASE 2020 challenge.
Sometimes authors copy-paste results from the original papers, which does not help reproducibility.