Search Results for author: Maximo Cobos

Found 10 papers, 1 paper with code

Acoustic source localization in the spherical harmonics domain exploiting low-rank approximations

no code implementations • 15 Mar 2023 • Maximo Cobos, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti

Acoustic signal processing in the spherical harmonics domain (SHD) is an active research area that exploits the signals acquired by higher order microphone arrays.

Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification

no code implementations • 28 Jul 2021 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Aaron Lopez-Garcia, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri

The early fusion stage combines features resulting from the last convolutional block of the respective subnetworks at different time steps to feed a bidirectional recurrent structure.

Scene Classification

Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation

no code implementations • 27 Jun 2020 • Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos

An automated audio captioning system accepts an audio signal as input and outputs a textual description, that is, the caption of the signal.

Audio captioning

Acoustic Scene Classification with Squeeze-Excitation Residual Networks

no code implementations • 20 Mar 2020 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Maximo Cobos

The behavior of the block that implements such operators and, therefore, the entire neural network, can be modified depending on the input to the block, the established residual configurations and the selected non-linear activations.

Acoustic Scene Classification • Classification • +4

An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments

1 code implementation • 26 Feb 2020 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Ana M. Torres, Jose J. Lopez, Francesc J. Ferri, Maximo Cobos

This paper aims to provide the audio recognition community with a carefully annotated dataset (https://zenodo.org/record/3689288) for few-shot learning (FSL) in an open-set recognition (OSR) context, comprising 1360 clips from 34 classes divided into pattern sounds and unwanted sounds.

Face Recognition • Few-Shot Learning • +4

On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification

no code implementations • 26 Jun 2019 • Javier Naranjo-Alcazar, Sergi Perez-Castanos, Irene Martin-Morato, Pedro Zuccarello, Maximo Cobos

The purpose of this paper is to analyze and discuss the performance of several residual block implementations within a state-of-the-art CNN-based architecture for end-to-end audio classification using raw audio waveforms.

Audio Classification • General Classification • +1

CNN depth analysis with different channel inputs for Acoustic Scene Classification

no code implementations • 10 Jun 2019 • Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri

Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks.

Acoustic Scene Classification • Audio Classification • +3
