no code implementations • 15 Mar 2023 • Maximo Cobos, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti
Acoustic signal processing in the spherical harmonics domain (SHD) is an active research area that exploits the signals acquired by higher order microphone arrays.
no code implementations • 30 Jul 2021 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Maximo Cobos, Francesc J. Ferri, Pedro Zuccarello
Acoustic scene classification (ASC) is one of the most popular problems in the field of machine listening.
no code implementations • 30 Jul 2021 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Francesc J. Ferri, Maximo Cobos
This year, it has been decided to study how this technique improves each of the datasets (last year only the MIC dataset was studied).
Direction of Arrival Estimation • Sound Event Localization and Detection
no code implementations • 28 Jul 2021 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Aaron Lopez-Garcia, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri
The early fusion stage combines features resulting from the last convolutional block of the respective subnetworks at different time steps to feed a bidirectional recurrent structure.
no code implementations • 27 Jun 2020 • Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos
Anomalous sound detection (ASD) is currently one of the topical subjects in the machine listening discipline.
no code implementations • 27 Jun 2020 • Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos
An automated audio captioning system accepts an audio signal as input and outputs a textual description, that is, the caption of the signal.
no code implementations • 20 Mar 2020 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Maximo Cobos
The behavior of the block that implements such operators and, therefore, the entire neural network, can be modified depending on the input to the block, the established residual configurations and the selected non-linear activations.
1 code implementation • 26 Feb 2020 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Ana M. Torres, Jose J. Lopez, Francesc J. Ferri, Maximo Cobos
This paper aims to provide the audio recognition community with a carefully annotated dataset (https://zenodo.org/record/3689288) for FSL in an OSR context, comprising 1360 clips from 34 classes divided into pattern sounds and unwanted sounds.
no code implementations • 26 Jun 2019 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Irene Martin-Morato, Pedro Zuccarello, Maximo Cobos
The purpose of this paper is to analyze and discuss the performance of several residual block implementations within a state-of-the-art CNN-based architecture for end-to-end audio classification using raw audio waveforms.
no code implementations • 10 Jun 2019 • Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri
Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks.