1 code implementation • ICML 2020 • Joao Monteiro, Isabela Albuquerque, Jahangir Alam, R. Devon Hjelm, Tiago Falk
In this contribution, we augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.
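A parametric pseudo-distance of this kind can be pictured as a small network scoring pairs of embeddings, trained jointly with the encoder. The sketch below is a minimal numpy illustration, not the paper's architecture: the dimensions, the random linear encoder, and the two-layer scorer are all hypothetical stand-ins for learned components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, for illustration only.
EMB_DIM, HID = 16, 8

# Encoder: a fixed random linear map standing in for a trained network.
W_enc = rng.standard_normal((32, EMB_DIM)) * 0.1

# Parametric pseudo-distance: a small MLP scoring a pair of embeddings.
# Its weights would be trained jointly with W_enc in the actual setting.
W1 = rng.standard_normal((2 * EMB_DIM, HID)) * 0.1
w2 = rng.standard_normal(HID) * 0.1

def encode(x):
    return x @ W_enc

def pseudo_distance(a, b):
    """Learned pairwise score; unlike a true metric, it need not be
    symmetric or satisfy the triangle inequality."""
    h = np.maximum(np.concatenate([a, b]) @ W1, 0.0)  # ReLU hidden layer
    return float(h @ w2)

x1, x2 = rng.standard_normal(32), rng.standard_normal(32)
d = pseudo_distance(encode(x1), encode(x2))
```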
1 code implementation • 20 Mar 2024 • R. Gnana Praveen, Jahangir Alam
In particular, we compute the attention weights based on the cross-correlation between the joint audio-visual-text feature representations and the feature representations of the individual modalities, simultaneously capturing intra- and inter-modal relationships across the modalities.
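The mechanism described above can be sketched in a few lines of numpy. This is an assumption-laden illustration, not the paper's model: the joint representation is taken as a simple sum of the three modality features, and the sequence length and feature dimension are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
T, D = 5, 8  # hypothetical sequence length and feature dimension

audio, video, text = (rng.standard_normal((T, D)) for _ in range(3))

# Joint audio-visual-text representation; a plain sum stands in for the
# learned joint feature of the paper (an assumption for illustration).
joint = audio + video + text

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(modality, joint):
    """Attention weights from the cross-correlation between the joint
    representation and one modality, then re-weight that modality."""
    corr = modality @ joint.T / np.sqrt(D)   # (T, T) cross-correlation
    weights = softmax(corr, axis=-1)
    return weights @ modality                # attended modality features

att_audio = cross_attend(audio, joint)
```

Because the correlation is taken against the joint representation rather than a single other modality, each modality's attention weights reflect both intra- and inter-modal relationships at once.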
no code implementations • 7 Nov 2018 • Gautam Bhattacharya, Joao Monteiro, Jahangir Alam, Patrick Kenny
Furthermore, we are able to significantly boost verification performance by averaging our different GAN models at the score level, achieving a relative improvement of 7.2% over the baseline.
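Score-level averaging of this kind is simple arithmetic: each system scores the same verification trial, and the fused score is the mean. The numbers below are made up purely to show the shape of the computation.

```python
# Hypothetical verification scores from three different GAN-based systems
# for the same trial (illustrative values only).
scores = [0.71, 0.64, 0.69]

# Score-level fusion by simple averaging.
fused = sum(scores) / len(scores)
```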
no code implementations • 7 Nov 2018 • Gautam Bhattacharya, Jahangir Alam, Patrick Kenny
In this article we propose a novel approach for adapting speaker embeddings to new domains based on adversarial training of neural networks.
no code implementations • 13 Dec 2019 • Hossein Zeinali, Kong Aik Lee, Jahangir Alam, Lukas Burget
This document describes the Short-duration Speaker Verification (SdSV) Challenge 2021.
no code implementations • 1 Jan 2021 • Joao Monteiro, Isabela Albuquerque, Jahangir Alam, Tiago Falk
Recent metric learning approaches parametrize semantic similarity measures through the use of an encoder trained along with a similarity model, which operates over pairs of representations.
no code implementations • 7 Dec 2021 • Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan
Over the recent years, various deep learning-based methods were proposed for extracting a fixed-dimensional embedding vector from speech signals.
no code implementations • 3 May 2022 • Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan
The main objective of the spoofing countermeasure system is to detect the artifacts within the input speech caused by the speech synthesis or voice conversion process.
no code implementations • 28 Sep 2023 • R. Gnana Praveen, Jahangir Alam
We have shown that efficiently leveraging the intra- and inter-modal relationships significantly improves the performance of audio-visual fusion for speaker verification.
no code implementations • 7 Mar 2024 • R. Gnana Praveen, Jahangir Alam
In this paper, we propose a Dynamic Cross-Attention (DCA) model that can dynamically select the cross-attended or unattended features on the fly based on the strong or weak complementary relationships, respectively, across audio and visual modalities.
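The dynamic selection idea can be sketched as a gate that softly mixes a modality's cross-attended and unattended features per frame. This is a simplified numpy stand-in, not the DCA model itself: the gating weights, sizes, and random inputs are hypothetical, and in the paper the gate would be learned to reflect the strength of the complementary relationship between modalities.

```python
import numpy as np

rng = np.random.default_rng(2)
T, D = 4, 6  # hypothetical sequence length and feature dimension

a_feat = rng.standard_normal((T, D))      # unattended audio features
a_attended = rng.standard_normal((T, D))  # audio features cross-attended by video

# Hypothetical gating parameters; learned in the actual model so the gate
# tracks how strongly the other modality complements this one.
W_g = rng.standard_normal((2 * D, 1)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dynamic_select(unattended, attended):
    """Soft per-frame choice between cross-attended and unattended features:
    gate near 1 keeps the attended features, near 0 falls back to the raw ones."""
    g = sigmoid(np.concatenate([unattended, attended], axis=-1) @ W_g)  # (T, 1)
    return g * attended + (1.0 - g) * unattended

out = dynamic_select(a_feat, a_attended)
```

With a soft gate the output is, per element, a convex combination of the two feature streams; a hard selection would replace the sigmoid with a thresholded choice.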
no code implementations • 7 Mar 2024 • R. Gnana Praveen, Jahangir Alam
In this paper, we have investigated the prospect of effectively capturing both the intra- and inter-modal relationships across audio and visual modalities, which can play a crucial role in significantly improving the fusion performance over unimodal systems.
no code implementations • 28 Mar 2024 • R. Gnana Praveen, Jahangir Alam
We also compare the proposed approach with other variants of cross-attention and show that the proposed model consistently improves the performance on both datasets.