Search Results for author: Jahangir Alam

Found 12 papers, 2 papers with code

An end-to-end approach for the verification problem: learning the right distance

1 code implementation • ICML 2020 • Joao Monteiro, Isabela Albuquerque, Jahangir Alam, R. Devon Hjelm, Tiago Falk

In this contribution, we augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.

Metric Learning
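
The abstract above describes replacing a fixed metric with a pseudo-distance trained jointly with the encoder. A minimal sketch of that general idea follows, assuming a simple PyTorch setup; the module names, dimensions, and toy data are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class LearnedPseudoDistance(nn.Module):
    """Parametric 'distance' over pairs of embeddings, trained jointly with
    the encoder (illustrative sketch, not the paper's exact model)."""
    def __init__(self, embed_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, e1: torch.Tensor, e2: torch.Tensor) -> torch.Tensor:
        # Concatenate the pair and map it to a scalar (dis)similarity score.
        return self.scorer(torch.cat([e1, e2], dim=-1)).squeeze(-1)

encoder = nn.Sequential(nn.Linear(40, 256), nn.ReLU(), nn.Linear(256, 256))
distance = LearnedPseudoDistance(embed_dim=256)

x1, x2 = torch.randn(8, 40), torch.randn(8, 40)   # toy input pairs
labels = torch.randint(0, 2, (8,)).float()        # 1 = same identity (toy labels)
scores = distance(encoder(x1), encoder(x2))
loss = nn.BCEWithLogitsLoss()(scores, labels)     # gradients flow into both modules
loss.backward()
```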

Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition

1 code implementation • 20 Mar 2024 • R. Gnana Praveen, Jahangir Alam

In particular, we compute the attention weights based on cross-correlation between the joint audio-visual-text feature representations and the feature representations of individual modalities to simultaneously capture intra- and intermodal relationships across the modalities.

Multimodal Emotion Recognition
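
The snippet above says attention weights are computed from the cross-correlation between a joint audio-visual-text representation and each individual modality. The following is only a rough sketch of that pattern under assumed fixed-size feature matrices; the projection layers, dimensions, and correlation parameterization are illustrative rather than the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointCrossModalAttention(nn.Module):
    """Attention weights from the cross-correlation between a joint A-V-T
    representation and each modality (illustrative sketch only)."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.joint_proj = nn.Linear(3 * dim, dim)            # fuse A, V, T
        self.corr_weight = nn.Parameter(torch.randn(dim, dim) * 0.02)

    def forward(self, audio, visual, text):
        # audio / visual / text: (batch, time, dim)
        joint = self.joint_proj(torch.cat([audio, visual, text], dim=-1))
        attended = []
        for modality in (audio, visual, text):
            # Cross-correlation between joint and modality features: (B, T, T)
            corr = torch.tanh(joint @ self.corr_weight @ modality.transpose(1, 2))
            attn = F.softmax(corr, dim=-1)
            attended.append(attn @ modality)                  # re-weight the modality
        return attended                                       # list of (B, T, dim)

a, v, t = (torch.randn(2, 50, 128) for _ in range(3))
a_att, v_att, t_att = JointCrossModalAttention()(a, v, t)
```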

Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-End Speaker Verification

no code implementations • 7 Nov 2018 • Gautam Bhattacharya, Joao Monteiro, Jahangir Alam, Patrick Kenny

Furthermore, we are able to significantly boost verification performance by averaging our different GAN models at the score level, achieving a relative improvement of 7.2% over the baseline.

Dimensionality Reduction · Speaker Verification
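
Score-level averaging, as mentioned in the snippet above, can be illustrated with a tiny NumPy sketch; the per-trial scores and threshold below are purely made-up placeholders.

```python
import numpy as np

# Per-trial verification scores from several independently trained models
# (values are illustrative, not results from the paper).
scores_per_model = np.stack([
    np.array([0.7, -0.2, 1.3]),   # model A
    np.array([0.9,  0.1, 1.0]),   # model B
    np.array([0.5, -0.4, 1.5]),   # model C
])

fused = scores_per_model.mean(axis=0)   # score-level fusion by simple averaging
decisions = fused > 0.0                 # threshold would be tuned on a dev set
```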

Adapting End-to-End Neural Speaker Verification to New Languages and Recording Conditions with Adversarial Training

no code implementations • 7 Nov 2018 • Gautam Bhattacharya, Jahangir Alam, Patrick Kenny

In this article we propose a novel approach for adapting speaker embeddings to new domains based on adversarial training of neural networks.

Text-Independent Speaker Verification
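
One common realization of adversarial training for domain adaptation is a gradient-reversal layer feeding a domain classifier; the sketch below shows that generic pattern only, and is not claimed to match the training setup used in this paper. All layer sizes and labels are illustrative.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

embedder = nn.Sequential(nn.Linear(40, 256), nn.ReLU(), nn.Linear(256, 256))
speaker_head = nn.Linear(256, 100)   # e.g. 100 training speakers (illustrative)
domain_head = nn.Linear(256, 2)      # source vs. target recording condition

x = torch.randn(16, 40)
spk = torch.randint(0, 100, (16,))
dom = torch.randint(0, 2, (16,))

emb = embedder(x)
loss_spk = nn.CrossEntropyLoss()(speaker_head(emb), spk)
# The reversed gradient pushes the embedder toward domain-invariant features.
loss_dom = nn.CrossEntropyLoss()(domain_head(GradReverse.apply(emb, 1.0)), dom)
(loss_spk + loss_dom).backward()
```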

Learning Semantic Similarities for Prototypical Classifiers

no code implementations • 1 Jan 2021 • Joao Monteiro, Isabela Albuquerque, Jahangir Alam, Tiago Falk

Recent metric learning approaches parametrize semantic similarity measures through the use of an encoder trained along with a similarity model, which operates over pairs of representations.

Few-Shot Learning · Metric Learning · +5
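
The abstract above pairs an encoder with a similarity model operating over pairs of representations. A minimal sketch of how such a learned similarity could score queries against class prototypes follows; the bilinear similarity, dimensions, and toy episode are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 128))
# Learned similarity over pairs of representations (illustrative bilinear form).
similarity = nn.Bilinear(128, 128, 1)

def prototypical_logits(support, support_labels, queries, n_classes):
    """Score each query against class prototypes using the learned similarity."""
    s_emb, q_emb = encoder(support), encoder(queries)
    prototypes = torch.stack(
        [s_emb[support_labels == c].mean(dim=0) for c in range(n_classes)]
    )                                                     # (n_classes, 128)
    n_q = q_emb.size(0)
    q_rep = q_emb.unsqueeze(1).expand(n_q, n_classes, 128).reshape(-1, 128)
    p_rep = prototypes.unsqueeze(0).expand(n_q, n_classes, 128).reshape(-1, 128)
    return similarity(q_rep, p_rep).view(n_q, n_classes)

support = torch.randn(10, 64)
support_labels = torch.arange(5).repeat(2)   # two support examples per class
queries = torch.randn(4, 64)
logits = prototypical_logits(support, support_labels, queries, n_classes=5)
```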

Robust Speech Representation Learning via Flow-based Embedding Regularization

no code implementations • 7 Dec 2021 • Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan

Over the recent years, various deep learning-based methods were proposed for extracting a fixed-dimensional embedding vector from speech signals.

Language Identification · Representation Learning · +1

Attentive activation function for improving end-to-end spoofing countermeasure systems

no code implementations • 3 May 2022 • Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan

The main objective of the spoofing countermeasure system is to detect the artifacts within the input speech caused by the speech synthesis or voice conversion process.

Speech Synthesis · Voice Conversion

Audio-Visual Speaker Verification via Joint Cross-Attention

no code implementations • 28 Sep 2023 • R. Gnana Praveen, Jahangir Alam

We have shown that efficiently leveraging the intra- and inter-modal relationships significantly improves the performance of audio-visual fusion for speaker verification.

Speaker Verification

Dynamic Cross Attention for Audio-Visual Person Verification

no code implementations • 7 Mar 2024 • R. Gnana Praveen, Jahangir Alam

In this paper, we propose a Dynamic Cross-Attention (DCA) model that can dynamically select the cross-attended or unattended features on the fly based on the strong or weak complementary relationships, respectively, across audio and visual modalities.
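
The dynamic selection between cross-attended and unattended features described above can be pictured as a learned soft gate; the sketch below is only an assumed realization of that selection idea, with made-up module names and feature sizes.

```python
import torch
import torch.nn as nn

class DynamicSelectionGate(nn.Module):
    """Soft gate between cross-attended and original (unattended) features,
    conditioned on both modalities (illustrative sketch of the selection idea)."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, unattended, cross_attended, other_modality):
        # Gate near 1 keeps the cross-attended view (strong complementarity);
        # gate near 0 falls back to the unattended features (weak complementarity).
        g = self.gate(torch.cat([unattended, other_modality], dim=-1))
        return g * cross_attended + (1.0 - g) * unattended

audio, audio_cross, visual = (torch.randn(4, 128) for _ in range(3))
fused_audio = DynamicSelectionGate()(audio, audio_cross, visual)
```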

Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention

no code implementations • 7 Mar 2024 • R. Gnana Praveen, Jahangir Alam

In this paper, we have investigated the prospect of effectively capturing both the intra- and inter-modal relationships across audio and visual modalities, which can play a crucial role in significantly improving the fusion performance over unimodal systems.

Cross-Attention is Not Always Needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition

no code implementations • 28 Mar 2024 • R. Gnana Praveen, Jahangir Alam

We also compare the proposed approach with other variants of cross-attention and show that the proposed model consistently improves the performance on both datasets.

Emotion Recognition
