However, there is limited research on pretraining neural networks on large datasets for audio pattern recognition.
Pronounced "musician", the musicnn library contains a set of pre-trained, musically motivated convolutional neural networks for music audio tagging: https://github.com/jordipons/musicnn.
The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.
We bridge the connection between attention neural networks and multiple instance learning (MIL) methods, and propose decision-level and feature-level attention neural networks for audio tagging.
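The decision-level attention idea described above can be sketched as a pooling step: each frame (an MIL instance) gets a learned attention weight, and the clip-level (bag) prediction is the attention-weighted sum of per-frame class probabilities. This is a minimal NumPy sketch, not the paper's implementation; the frame probabilities and attention logits here are random stand-ins for the outputs of trained network heads, and the sizes (100 frames, 80 classes) are illustrative.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decision_level_attention_pooling(frame_probs, attn_logits):
    """Aggregate per-frame (instance) class probabilities into a
    clip-level (bag) prediction using learned attention weights."""
    # Normalize attention over the time axis so weights per class sum to 1.
    w = softmax(attn_logits, axis=0)        # shape (frames, classes)
    return (w * frame_probs).sum(axis=0)    # shape (classes,)

rng = np.random.default_rng(0)
T, C = 100, 80                              # frames per clip, number of classes
frame_probs = rng.uniform(size=(T, C))      # stand-in for per-frame sigmoid outputs
attn_logits = rng.normal(size=(T, C))       # stand-in for a learned attention head
clip_probs = decision_level_attention_pooling(frame_probs, attn_logits)
```

Because the attention weights are a convex combination over frames, each clip-level probability stays in [0, 1] whenever the frame-level probabilities do.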
We present an end-to-end deep learning approach to denoising speech signals by processing the raw waveform directly.
Audio tagging is challenging due to the limited size of data and noisy labels.
Single-task deep neural networks, each targeting one task among the diverse, cross-related tasks in the acoustic scene and event literature, are being developed.
For unsupervised feature learning, we propose using a symmetric or asymmetric deep denoising auto-encoder (sDAE or aDAE) to generate new data-driven features from Mel filter bank (MFB) features.
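The denoising auto-encoder idea can be illustrated with a single forward pass: corrupt an MFB frame, encode it to a bottleneck code (the new data-driven feature), and decode back toward the clean frame. This is a hedged sketch only; the layer sizes, masking-noise rate, and tied weights are assumptions for illustration, not the sDAE/aDAE architectures from the paper, and training (minimizing reconstruction error against the clean input) is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

def dae_forward(x_noisy, W_enc, b_enc, W_dec, b_dec):
    """One forward pass of a denoising auto-encoder: encode the
    corrupted input, then decode back to the clean feature space."""
    h = relu(x_noisy @ W_enc + b_enc)   # bottleneck code = new data-driven feature
    x_hat = h @ W_dec + b_dec           # reconstruction of the clean MFB frame
    return h, x_hat

n_mfb, n_hidden = 40, 16                     # assumed: 40 Mel filter banks, 16-d code
x_clean = rng.normal(size=(8, n_mfb))        # a batch of 8 MFB frames (random stand-ins)
mask = rng.uniform(size=x_clean.shape) > 0.2 # masking noise: zero out ~20% of inputs
x_noisy = x_clean * mask

W_enc = rng.normal(scale=0.1, size=(n_mfb, n_hidden))
b_enc = np.zeros(n_hidden)
W_dec = W_enc.T.copy()                       # tied decoder weights, a common DAE choice
b_dec = np.zeros(n_mfb)

code, recon = dae_forward(x_noisy, W_enc, b_enc, W_dec, b_dec)
```

In training, the reconstruction loss would be measured against `x_clean` (not `x_noisy`), which is what forces the code to capture noise-robust structure.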
This paper describes the CRNNs we used to participate in Task 5 of the DCASE 2020 Challenge.
Sometimes authors copy-paste results from the original papers, which does not help reproducibility.