The proposed method uses an auxiliary classifier, trained on data that is known to be in-distribution, for detection and relabelling.
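One way to picture detection and relabelling with an auxiliary classifier is a simple confidence rule. This is only a sketch under stated assumptions: the excerpt does not say how either step is decided, so the softmax confidence threshold, the function names `softmax` and `detect_and_relabel`, and the default `threshold=0.9` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def detect_and_relabel(aux_logits, given_labels, threshold=0.9):
    """Flag and relabel samples using auxiliary-classifier outputs.

    ASSUMPTION (not from the source): a sample counts as in-distribution
    when the auxiliary classifier's top softmax probability reaches
    `threshold`. Such samples take the predicted class as their label;
    all others keep their given label and are flagged as not
    in-distribution.
    """
    new_labels, in_dist_flags = [], []
    for logits, label in zip(aux_logits, given_labels):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        confident = probs[best] >= threshold
        in_dist_flags.append(confident)
        new_labels.append(best if confident else label)
    return new_labels, in_dist_flags
```

Here `aux_logits` holds one list of class scores per sample, as produced by a classifier trained only on known in-distribution data. The threshold trades missed detections against wrongly relabelled samples and would need tuning on held-out data.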
Audio data augmentation is a key step in training deep neural networks for audio classification tasks.
Long-range dependency modeling, widely used to capture spatiotemporal correlation, has been shown to be effective in CNN-dominated computer vision tasks.
We introduce a probabilistic approach to unify deep continual learning with open set recognition, based on variational Bayesian inference.
In this paper, we investigate how to learn rich and robust feature representations for audio classification from visual data and acoustic images, a novel audio data modality.
Interpretability of deep neural networks is a recently emerging area of machine learning research targeting a better understanding of how models perform feature selection and derive their classification decisions.
Based on this, we introduce a method for descriptor-based synthesis and show that we can control the descriptors of an instrument while keeping its timbre structure.
In this work, we first describe a CNN-based approach for weakly supervised training of audio events.