In this paper, we investigate the importance of the extent of memory in sequential self-attention for sound recognition.
The neural adapter layer enables the target model to learn new sound events with minimal training data while maintaining performance on previously learned sound events comparable to that of the source model.
In this study, we examine whether we can analyze and compare Western and Chinese classical music based on soundscape models.
This paper addresses the application of sound event detection at the edge, by optimizing deep learning techniques on resource-constrained embedded platforms for the IoT.
This paper considers a semi-supervised learning framework for weakly labeled polyphonic sound event detection in the DCASE 2019 challenge's Task 4, combining tri-training with adversarial learning.
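As a rough illustration of the tri-training half of such a framework (the adversarial component is omitted), the sketch below trains three learners on bootstrap samples of the labeled data and, in each round, pseudo-labels an unlabeled point for one learner whenever the other two agree on it. The nearest-centroid learner and all names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class NearestCentroid:
    """Tiny stand-in learner: predicts the class of the nearest class mean."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None]) ** 2).sum(-1)
        return self.classes_[d.argmin(axis=1)]

def tri_train(X_lab, y_lab, X_unl, rounds=3, seed=0):
    """Tri-training: three learners bootstrap-trained on labeled data;
    each round, unlabeled points are pseudo-labeled for learner i
    whenever the other two learners agree on them."""
    rng = np.random.default_rng(seed)
    n = len(X_lab)
    models = []
    for _ in range(3):
        idx = rng.integers(0, n, n)              # bootstrap sample
        models.append(NearestCentroid().fit(X_lab[idx], y_lab[idx]))
    for _ in range(rounds):
        preds = [m.predict(X_unl) for m in models]
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            agree = preds[j] == preds[k]         # the other two agree
            if agree.any():
                X_aug = np.vstack([X_lab, X_unl[agree]])
                y_aug = np.concatenate([y_lab, preds[j][agree]])
                models[i].fit(X_aug, y_aug)      # retrain on pseudo-labels
    return models
```

In a real SED system the learners would be neural networks over audio features and the agreement rule would typically include a confidence threshold; the loop structure is the same.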
In this study, we introduce a convolutional time-frequency-channel "Squeeze and Excitation" (tfc-SE) module to explicitly model interdependencies between the time-frequency domain and multiple channels.
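For context, the standard channel-wise Squeeze-and-Excitation operation that such a module extends can be sketched as follows: global average pooling over the time-frequency map ("squeeze"), a small two-layer gate network ("excitation"), and per-channel rescaling. This is a minimal NumPy sketch of plain channel SE only, not the tfc-SE variant itself; the weight names are hypothetical.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Channel Squeeze-and-Excitation over a (C, T, F) feature tensor.

    Squeeze:    global average pool over time and frequency -> (C,)
    Excitation: bottleneck dense layer (ReLU), then expansion (sigmoid)
    Scale:      reweight each channel map by its learned gate in (0, 1).
    """
    z = x.mean(axis=(1, 2))                      # squeeze: (C,)
    h = np.maximum(0.0, w1 @ z + b1)             # bottleneck, ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))     # per-channel gates, sigmoid
    return x * s[:, None, None]                  # scale each channel map
```

A time-frequency variant applies the same squeeze/excite pattern along the T and F axes instead of (or in addition to) the channel axis.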
Learning from data in the quaternion domain enables us to exploit internal dependencies of 4D signals and to treat them as a single entity.
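The core operation behind this idea is the Hamilton product, in which every output component mixes all four input components, unlike an element-wise product on four independent channels. A minimal sketch, using plain tuples rather than any particular quaternion layer implementation:

```python
def hamilton_product(q1, q2):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples.

    Each output component depends on all four components of both inputs,
    which is how quaternion layers couple the channels of a 4D signal
    instead of processing them independently.
    """
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,   # real part
            a1*b2 + b1*a2 + c1*d2 - d1*c2,   # i
            a1*c2 - b1*d2 + c1*a2 + d1*b2,   # j
            a1*d2 + b1*c2 - c1*b2 + d1*a2)   # k

# The product is non-commutative: i * j = k, but j * i = -k.
```

In a quaternion neural network, weights and activations are quaternion-valued and layer outputs are computed with this product, so a quaternion layer needs only a quarter of the free parameters of a real-valued layer of the same width.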
Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings.