The artist similarity quest has become a crucial subject in social and scientific contexts.
Sound design involves creatively selecting, recording, and editing sound effects for various media like cinema, video games, and virtual/augmented reality.
The proposed quaternion wavelet network (QUAVE) can be easily integrated with any pre-existing medical image analysis or synthesis task, and it can be used with real-, quaternion-, or hypercomplex-valued models, generalizing their adoption to single-channel data.
To overcome these limitations, we employ a dual quaternion representation of rigid motions in the 3D space that jointly describes rotations and translations of point sets, processing each of the points as a single entity.
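As a rough illustration of how a dual quaternion can carry a rotation and a translation together, the following minimal sketch builds one from an axis, an angle, and a translation vector, then applies it to a 3D point. The function names (`qmul`, `dual_quaternion`, `transform_point`) and the `(w, x, y, z)` tuple layout are assumptions made for this example and are not taken from the paper.

```python
import math

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def qconj(q):
    """Quaternion conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def dual_quaternion(axis, angle, translation):
    """Return (real, dual) parts for a rotation about a unit axis
    followed by a translation. The dual part is 0.5 * t * real."""
    s = math.sin(angle / 2.0)
    real = (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)
    t = (0.0, translation[0], translation[1], translation[2])
    dual = tuple(0.5 * c for c in qmul(t, real))
    return real, dual

def transform_point(dq, p):
    """Rotate p with the real part, then add the translation
    recovered from the dual part as 2 * dual * conj(real)."""
    real, dual = dq
    rotated = qmul(qmul(real, (0.0, p[0], p[1], p[2])), qconj(real))
    t = tuple(2.0 * c for c in qmul(dual, qconj(real)))
    return tuple(r + ti for r, ti in zip(rotated[1:], t[1:]))
```

For example, a 90-degree rotation about the z axis combined with the translation (1, 2, 3) maps the point (1, 0, 0) to approximately (1, 3, 3): `transform_point(dual_quaternion((0.0, 0.0, 1.0), math.pi / 2, (1.0, 2.0, 3.0)), (1.0, 0.0, 0.0))`.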
In this step, a parameterized hypercomplex neural network (PHNN) is employed to perform breast cancer classification.
Neural models based on hypercomplex algebra systems are growing and proliferating for a plethora of applications, ranging from computer vision to natural language processing.
Multimodal emotion recognition from physiological signals is receiving increasing attention because, unlike behavioral reactions, these signals cannot be controlled at will and thus provide more reliable information.
Semantic communication is poised to play a pivotal role in shaping the landscape of future AI-driven communication systems.
We prove, through an in-depth assessment of multiple scenarios, that our method outperforms existing solutions in generating high-quality images with preserved semantic information even in cases where the received content is significantly degraded.
We test our model on aerial images of the DroneVehicle dataset containing RGB-IR paired images.
Image-to-image translation (I2I) aims at transferring the content representation from an input domain to an output one, bouncing along different target domains.
Ranked #3 on Image-to-Image Translation on CelebA-HQ
The proposed methods are able to handle the information of a patient altogether without breaking the multi-view nature of the exam.
On the one hand, the classifier makes it possible to optimize each latent axis of the embeddings for the classification of a specific emotion-related characteristic: valence, arousal, dominance, and overall emotion.
We show that our dual quaternion SELD model with temporal convolution blocks (DualQSELD-TCN) achieves better results with respect to real and quaternion-valued baselines thanks to our augmented representation of the sound field.
Ranked #1 on Sound Event Localization and Detection on L3DAS21
The L3DAS22 Challenge is aimed at encouraging the development of machine learning strategies for 3D speech enhancement and 3D sound localization and detection in office-like environments.
In this paper, we define the parameterization of hypercomplex convolutional layers and introduce the family of parameterized hypercomplex neural networks (PHNNs) that are lightweight and efficient large-scale models.
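A common way to parameterize a hypercomplex layer is to build its weight matrix as a sum of Kronecker products, W = sum_i A_i ⊗ F_i, where the small n×n matrices A_i encode the algebra rules and the F_i hold the filters. The abstract does not spell this form out, so the sketch below is an assumption about the construction, shown for a dense layer rather than a convolution, and all names (`kron`, `ph_weight`, `param_counts`) are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of lists."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]

def ph_weight(a_list, f_list):
    """Assemble W as the sum over i of kron(A_i, F_i)."""
    total = None
    for a, f in zip(a_list, f_list):
        term = kron(a, f)
        if total is None:
            total = term
        else:
            total = [[x + y for x, y in zip(r1, r2)]
                     for r1, r2 in zip(total, term)]
    return total

def param_counts(n, d_in, d_out):
    """Free parameters of a dense layer versus the Kronecker-sum form
    (n matrices of size n x n plus n blocks of size d_in/n x d_out/n)."""
    dense = d_in * d_out
    ph = n * n * n + n * (d_in // n) * (d_out // n)
    return dense, ph
```

With n = 2, A_1 set to the identity and A_2 to `[[0, -1], [1, 0]]`, the assembled matrix reproduces complex multiplication, which hints at how fixed algebras become special cases of the learned one. Under these assumptions the parameter count drops to roughly 1/n of the dense layer: `param_counts(4, 64, 64)` gives 4096 versus 1088.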
Ranked #1 on Sound Event Detection on L3DAS21
In order to make this class of functional link adaptive filters (FLAFs) efficient, we propose low-complexity expansions and frequency-domain adaptation of the parameters.
The latest Generative Adversarial Networks (GANs) achieve outstanding results through large-scale training, employing models composed of millions of parameters that require extensive computational capabilities.
Ranked #1 on Image Generation on Oxford 102 Flowers 128x128
1 code implementation • 12 Apr 2021 • Eric Guizzo, Riccardo F. Gramaccioni, Saeid Jamili, Christian Marinoni, Edoardo Massaro, Claudia Medaglia, Giuseppe Nachira, Leonardo Nucciarelli, Ludovica Paglialunga, Marco Pennese, Sveva Pepe, Enrico Rocchi, Aurelio Uncini, Danilo Comminiello
The L3DAS21 Challenge is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD).
Nonlinear adaptive filters often show some sparse behavior due to the fact that not all the coefficients are equally useful for the modeling of any nonlinearity.
In this paper, we present a deep learning method that is able to reconstruct subsampled MR images obtained by reducing the k-space data, while maintaining a high image quality that can be used to observe brain lesions.
To this end, we investigate two extensions of l1 and structured regularization to the quaternion domain.
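To make the idea of carrying an l1 penalty into the quaternion domain more concrete, the sketch below contrasts a plain component-wise l1 penalty with one that sums the norm of each quaternion weight, so that all four components of a weight are pushed toward zero together. This is only one plausible reading; the abstract does not define the two extensions, and the function names are hypothetical.

```python
import math

def l1_componentwise(weights):
    """Real-valued l1: sum of absolute values of every component."""
    return sum(abs(c) for q in weights for c in q)

def l1_quaternion(weights):
    """Quaternion-level penalty: sum of the Euclidean norm of each
    quaternion (w, x, y, z), treating the weight as a single entity."""
    return sum(math.sqrt(sum(c * c for c in q)) for q in weights)
```

For the two weights `(3, 4, 0, 0)` and `(0, 0, 0, 0)`, the component-wise penalty is 7 while the quaternion-level penalty is 5.0. The second form depends only on the magnitude of each quaternion, which is why it tends to zero out whole quaternions rather than individual components.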
Complex-valued neural networks (CVNNs) have been shown to be powerful nonlinear approximators when the input data can be properly modeled in the complex domain.
Learning from data in the quaternion domain enables us to exploit internal dependencies of 4D signals and treat them as a single entity.
Gated recurrent neural networks have achieved remarkable results in the analysis of sequential data.
Graph neural networks (GNNs) are a class of neural networks that efficiently perform inference on data associated with a graph structure, such as citation networks or knowledge graphs.
In this paper, we consider the joint task of simultaneously optimizing (i) the weights of a deep neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection).
In this paper, we derive a modified InfoMax algorithm for the solution of Blind Signal Separation (BSS) problems by using advanced stochastic methods.