no code implementations • 22 Jan 2024 • Jisi Zhang, Vandana Rajan, Haaris Mehmood, David Tuckey, Pablo Peso Parada, Md Asif Jalal, Karthikeyan Saravanan, Gil Ho Lee, Jungin Lee, Seokyeong Jung
On-device Automatic Speech Recognition (ASR) models trained on speech data of a large population might underperform for individuals unseen during training.
Automatic Speech Recognition (ASR) +1
1 code implementation • 18 Feb 2022 • Vandana Rajan, Alessio Brutti, Andrea Cavallaro
Generally, models that fuse complementary information from multiple modalities outperform their uni-modal counterparts.
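As a minimal illustration of this idea (the paper's actual fusion architecture is not shown in this snippet; the feature dimensions and concatenation scheme below are assumptions for the sketch), late fusion of two modality embeddings can look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(audio_feat, video_feat):
    """Late fusion by concatenation: the combined vector carries
    complementary information from both modalities, which a downstream
    classifier can exploit beyond either modality alone."""
    return np.concatenate([audio_feat, video_feat], axis=-1)

audio = rng.standard_normal(128)   # hypothetical speech embedding
video = rng.standard_normal(512)   # hypothetical visual embedding
fused = fuse(audio, video)
assert fused.shape == (640,)       # joint representation for a classifier head
```

In practice the fused vector would feed a shared classification head trained jointly with (or on top of) the uni-modal encoders.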
no code implementations • 2 Aug 2021 • Vandana Rajan, Alessio Brutti, Andrea Cavallaro
For this reason, we aim to improve the performance of uni-modal affect recognition models by transferring knowledge from a better-performing (or stronger) modality to a weaker modality during training.
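One standard way to transfer knowledge from a stronger to a weaker modality is a distillation-style loss in which the weak-modality student matches the strong-modality teacher's softened output distribution. This is a hedged sketch of that generic mechanism, not necessarily the paper's exact formulation:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax (numerically stabilized)."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    as in standard knowledge distillation. Here the teacher would be
    the stronger modality and the student the weaker one."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))))

t = [2.0, 0.5, -1.0]
loss_same = kd_loss(t, t)              # identical outputs -> zero loss
loss_diff = kd_loss(t, [0.0, 0.0, 0.0])
assert abs(loss_same) < 1e-12 and loss_diff > 0
```

During training this term would be added to the weaker modality's task loss, so its predictions are pulled toward the stronger modality's.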
no code implementations • 3 Nov 2020 • Vandana Rajan, Alessio Brutti, Andrea Cavallaro
The proposed multi-modal training framework uses cross-modal translation and correlation-based latent space alignment to improve the representations of the weaker modalities.
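The correlation-based alignment mentioned above can be sketched, under the assumption of paired latent vectors, as a loss that maximizes the Pearson correlation between the two modalities' latents (the single-pair form below is illustrative; the paper's exact alignment objective may differ):

```python
import numpy as np

def correlation_loss(z_strong, z_weak, eps=1e-8):
    """Negative Pearson correlation between paired latent vectors.
    Minimizing this pulls the weaker modality's latent representation
    toward the stronger modality's in the shared latent space."""
    zs = z_strong - z_strong.mean()
    zw = z_weak - z_weak.mean()
    corr = (zs @ zw) / (np.linalg.norm(zs) * np.linalg.norm(zw) + eps)
    return -corr

z = np.arange(8.0)
assert correlation_loss(z, z) < -0.999   # perfectly aligned latents
assert correlation_loss(z, -z) > 0.999   # anti-correlated latents
```

Batch-level variants of this idea (e.g. CCA-style objectives) align whole distributions of latents rather than single pairs.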
1 code implementation • 26 Sep 2019 • Vandana Rajan, Alessio Brutti, Andrea Cavallaro
Computational paralinguistics aims to infer human emotions, personality traits and behavioural patterns from speech signals.