Search Results for author: Mostafa Sadeghi

Found 25 papers, 2 papers with code

Unsupervised speech enhancement with diffusion-based generative models

2 code implementations • 19 Sep 2023 • Berné Nortier, Mostafa Sadeghi, Romain Serizel

To address this issue, we introduce an alternative approach that operates in an unsupervised manner, leveraging the generative power of diffusion models.

Speech Enhancement

Progressive Learning for Systematic Design of Large Neural Networks

1 code implementation • 23 Oct 2017 • Saikat Chatterjee, Alireza M. Javid, Mostafa Sadeghi, Partha P. Mitra, Mikael Skoglund

The developed network is expected to show good generalization power due to appropriate regularization and use of random weights in the layers.

Optimization of Clustering for Clustering-based Image Denoising

no code implementations • 12 Jun 2013 • Mohsen Joneidi, Mostafa Sadeghi

In this paper, the problem of de-noising of an image contaminated with additive white Gaussian noise (AWGN) is studied.

Clustering • Dictionary Learning +1

SSFN -- Self Size-estimating Feed-forward Network with Low Complexity, Limited Need for Human Intervention, and Consistent Behaviour across Trials

no code implementations • 17 May 2019 • Saikat Chatterjee, Alireza M. Javid, Mostafa Sadeghi, Shumpei Kikuta, Dong Liu, Partha P. Mitra, Mikael Skoglund

We design a self size-estimating feed-forward network (SSFN) using a joint optimization approach for estimating the number of layers, the number of nodes, and the weight matrices.

Image Classification

Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders

no code implementations • 7 Aug 2019 • Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud

Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data.

Speech Enhancement
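
As background for the VAE modeling used in this and several later entries, here is a minimal PyTorch sketch of a speech VAE whose decoder outputs a per-frequency log-variance. The layer sizes and latent dimension are illustrative assumptions, not values from the paper, and the paper's conditional model additionally takes visual features as input.

```python
import torch
import torch.nn as nn

class SpeechVAE(nn.Module):
    """Minimal VAE for speech power-spectrogram frames (illustrative sizes)."""
    def __init__(self, input_dim=513, hidden_dim=128, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.Tanh())
        self.enc_mu = nn.Linear(hidden_dim, latent_dim)
        self.enc_logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden_dim), nn.Tanh(),
                                 nn.Linear(hidden_dim, input_dim))

    def forward(self, x_pow):
        h = self.enc(x_pow)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        log_var_s = self.dec(z)  # log-variance of a zero-mean Gaussian per frequency bin
        return log_var_s, mu, logvar

def elbo_loss(x_pow, log_var_s, mu, logvar):
    # Reconstruction: NLL of a zero-mean Gaussian with variance exp(log_var_s)
    rec = (log_var_s + x_pow * torch.exp(-log_var_s)).sum(dim=-1)
    # KL divergence between q(z|x) and the standard normal prior
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)
    return (rec + kl).mean()
```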

Robust Unsupervised Audio-visual Speech Enhancement Using a Mixture of Variational Autoencoders

no code implementations • 10 Nov 2019 • Mostafa Sadeghi, Xavier Alameda-Pineda

When the visual data is clean, speech enhancement with an audio-visual VAE performs better than with an audio-only VAE trained on audio-only data.

Speech Enhancement

Mixture of Inference Networks for VAE-based Audio-visual Speech Enhancement

no code implementations • 23 Dec 2019 • Mostafa Sadeghi, Xavier Alameda-Pineda

Two encoder networks take as input audio and visual data, respectively, and the posterior of the latent variables is modeled as a mixture of the two Gaussian distributions output by the encoder networks.

Speech Enhancement • Variational Inference
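
A hedged sketch of the mixture-of-inference-networks idea described above: two modality-specific encoders each output Gaussian posterior parameters, and a gating value mixes the two components. The dimensions, the gating network, and the hard component sampling are illustrative assumptions of this sketch, not the paper's exact design.

```python
import torch
import torch.nn as nn

class MixtureInferenceNet(nn.Module):
    """Posterior q(z | audio, visual) as a two-component Gaussian mixture (sketch)."""
    def __init__(self, audio_dim=513, visual_dim=1280, latent_dim=32, hidden=128):
        super().__init__()
        self.enc_a = nn.Sequential(nn.Linear(audio_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 2 * latent_dim))
        self.enc_v = nn.Sequential(nn.Linear(visual_dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 2 * latent_dim))
        # Mixture weight predicted from both modalities (an assumption of this sketch)
        self.gate = nn.Sequential(nn.Linear(audio_dim + visual_dim, 1), nn.Sigmoid())

    def forward(self, audio, visual):
        mu_a, logvar_a = self.enc_a(audio).chunk(2, dim=-1)
        mu_v, logvar_v = self.enc_v(visual).chunk(2, dim=-1)
        alpha = self.gate(torch.cat([audio, visual], dim=-1))  # weight of the audio component
        # Hard component choice for simplicity; training would need a relaxed/marginalized form
        use_audio = (torch.rand_like(alpha) < alpha).float()
        mu = use_audio * mu_a + (1 - use_audio) * mu_v
        logvar = use_audio * logvar_a + (1 - use_audio) * logvar_v
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return z, (mu_a, logvar_a), (mu_v, logvar_v), alpha
```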

Unsupervised Performance Analysis of 3D Face Alignment with a Statistically Robust Confidence Test

no code implementations • 14 Apr 2020 • Mostafa Sadeghi, Xavier Alameda-Pineda, Radu Horaud

The results show that the proposed analysis is consistent with supervised metrics and that it can be used to measure the accuracy of both predicted landmarks and automatically annotated 3DFA datasets, to detect errors, and to eliminate them.

3D Face Alignment • Face Alignment

Deep Variational Generative Models for Audio-visual Speech Separation

no code implementations • 17 Aug 2020 • Viet-Nhat Nguyen, Mostafa Sadeghi, Elisa Ricci, Xavier Alameda-Pineda

To better utilize the visual information, the posteriors of the latent variables are inferred from mixed speech (instead of clean speech) as well as the visual data.

Speech Separation

Face Frontalization Based on Robustly Fitting a Deformable Shape Model to 3D Landmarks

no code implementations • 26 Oct 2020 • Zhiqi Kang, Mostafa Sadeghi, Radu Horaud

We propose to model inliers and outliers with the generalized Student's t-probability distribution function, a heavy-tailed distribution that is immune to non-Gaussian errors in the data.

Face Alignment • Face Model +2
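
The heavy-tailed error model mentioned above is typically handled with an EM-style iteratively reweighted fit, in which points with large residuals receive small weights. Below is a minimal NumPy sketch under that assumption, fitting a generic affine map to 3D landmarks rather than the paper's deformable shape model.

```python
import numpy as np

def student_t_weights(residuals, nu=3.0, sigma2=1.0):
    """EM-style weights under a Student's t error model: large residuals get small weights."""
    d = residuals.shape[1]                      # residual dimension (3 for 3D landmarks)
    r2 = np.sum(residuals ** 2, axis=1) / sigma2
    return (nu + d) / (nu + r2)

def robust_affine_fit(X, Y, n_iter=20, nu=3.0):
    """Fit Y ~ [X, 1] @ P robustly with IRLS (illustrative; not the paper's exact algorithm)."""
    N = X.shape[0]
    Xh = np.hstack([X, np.ones((N, 1))])        # homogeneous coordinates
    W = np.ones(N)
    for _ in range(n_iter):
        Wsqrt = np.sqrt(W)[:, None]
        P, *_ = np.linalg.lstsq(Wsqrt * Xh, Wsqrt * Y, rcond=None)  # weighted least squares
        R = Y - Xh @ P                          # residuals, shape (N, 3)
        sigma2 = np.sum(W * np.sum(R ** 2, axis=1)) / (np.sum(W) * Y.shape[1])
        W = student_t_weights(R, nu=nu, sigma2=sigma2)
    return P, W
```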

Switching Variational Auto-Encoders for Noise-Agnostic Audio-visual Speech Enhancement

no code implementations • 8 Feb 2021 • Mostafa Sadeghi, Xavier Alameda-Pineda

Recently, audio-visual speech enhancement has been tackled in an unsupervised setting based on variational auto-encoders (VAEs): during training, only clean data is used to learn a generative model of speech, which at test time is combined with a noise model, e.g. nonnegative matrix factorization (NMF), whose parameters are learned without supervision.

Speech Enhancement
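
The unsupervised VAE-plus-NMF recipe described in the excerpt is commonly written as follows; this is a sketch of the standard formulation from this line of work, not necessarily the exact switching model of this paper. The clean-speech STFT coefficient is generated by the VAE decoder, the noise follows an NMF variance model, the observed mixture is their sum, and the speech estimate is a Wiener-like filtering of the mixture:

```latex
\[
s_{ft}\mid \mathbf{z}_t \sim \mathcal{N}_c\!\bigl(0,\ \sigma_f^2(\mathbf{z}_t)\bigr),\qquad
n_{ft} \sim \mathcal{N}_c\!\bigl(0,\ (\mathbf{W}\mathbf{H})_{ft}\bigr),\qquad
x_{ft} = s_{ft} + n_{ft},
\]
\[
\hat{s}_{ft} \;=\; \frac{\sigma_f^2(\mathbf{z}_t)}{\sigma_f^2(\mathbf{z}_t) + (\mathbf{W}\mathbf{H})_{ft}}\; x_{ft}.
\]
```

Here the NMF parameters W, H and the latent variables z_t are estimated from the noisy signal alone at test time, which is what makes the approach unsupervised with respect to noise.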

A Sparsity-promoting Dictionary Model for Variational Autoencoders

no code implementations • 29 Mar 2022 • Mostafa Sadeghi, Paul Magron

Structuring the latent space in probabilistic deep generative models, e.g., variational autoencoders (VAEs), is important to yield more expressive models and interpretable representations, and to avoid overfitting.

Variational Inference

Expression-preserving face frontalization improves visually assisted speech processing

no code implementations • 6 Apr 2022 • Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda

The method alternates between the estimation of (i) the rigid transformation (scale, rotation, and translation) and (ii) the non-rigid deformation between an arbitrarily-viewed face and a face model.

Face Model • Lip Reading +1
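
The rigid step of such an alternation has a closed-form solution via (weighted) Procrustes alignment; below is a minimal unweighted NumPy sketch, not the paper's exact robust estimator. With the scale, rotation, and translation fixed, the non-rigid deformation would then be re-estimated and the two steps alternated.

```python
import numpy as np

def similarity_procrustes(X, Y):
    """Closed-form scale s, rotation R, translation t minimizing ||Y - (s R X + t)||^2.
    X, Y: (N, 3) corresponding 3D landmarks (e.g., face model and observed face)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    U, S, Vt = np.linalg.svd(Yc.T @ Xc)                       # cross-covariance SVD
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])   # avoid reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / np.sum(Xc ** 2)
    t = my - s * R @ mx
    return s, R, t
```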

Audio-visual speech enhancement with a deep Kalman filter generative model

no code implementations • 2 Nov 2022 • Ali Golmakani, Mostafa Sadeghi, Romain Serizel

Deep latent variable generative models based on the variational autoencoder (VAE) have shown promising performance for audio-visual speech enhancement (AVSE).

Speech Enhancement
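
For reference, the generic form of a deep Kalman filter generative model is shown below, with neural networks parameterizing the latent dynamics and the speech variance; the paper's exact parameterization may differ.

```latex
\[
\mathbf{z}_t \mid \mathbf{z}_{t-1} \sim
\mathcal{N}\!\bigl(\boldsymbol{\mu}_\theta(\mathbf{z}_{t-1}),\ \mathrm{diag}\,\boldsymbol{\sigma}^2_\theta(\mathbf{z}_{t-1})\bigr),
\qquad
s_{ft} \mid \mathbf{z}_t \sim \mathcal{N}_c\!\bigl(0,\ v_{f,\theta}(\mathbf{z}_t)\bigr).
\]
```

The latent state thus evolves through a learned nonlinear transition instead of the linear-Gaussian dynamics of a classical Kalman filter.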

A weighted-variance variational autoencoder model for speech enhancement

no code implementations • 2 Nov 2022 • Ali Golmakani, Mostafa Sadeghi, Xavier Alameda-Pineda, Romain Serizel

A zero-mean complex-valued Gaussian distribution is usually assumed for the generative model, where the speech information is encoded in the variance as a function of a latent variable.

Speech Enhancement
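
Spelled out, the assumption in the excerpt is the zero-mean complex Gaussian below, whose negative log-likelihood ties the decoder variance to the observed power spectrum in an Itakura-Saito-like way. This is the standard unweighted form; the paper's variance weighting is not reproduced here.

```latex
\[
s_{ft} \mid \mathbf{z}_t \sim \mathcal{N}_c\!\bigl(0,\ \sigma_f^2(\mathbf{z}_t)\bigr),
\qquad
-\log p(s_{ft}\mid\mathbf{z}_t) \;=\; \log \pi\sigma_f^2(\mathbf{z}_t) \;+\; \frac{|s_{ft}|^2}{\sigma_f^2(\mathbf{z}_t)}.
\]
```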

Fast and efficient speech enhancement with variational autoencoders

no code implementations • 2 Nov 2022 • Mostafa Sadeghi, Romain Serizel

Unsupervised speech enhancement based on variational autoencoders has shown promising performance compared with the commonly used supervised methods.

Computational Efficiency • Speech Enhancement +1

Diffusion-based speech enhancement with a weighted generative-supervised learning loss

no code implementations • 19 Sep 2023 • Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel

Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods.

Speech Enhancement

Posterior sampling algorithms for unsupervised speech enhancement with recurrent variational autoencoder

no code implementations • 19 Sep 2023 • Mostafa Sadeghi, Romain Serizel

Nevertheless, the involved iterative variational expectation-maximization (VEM) process at test time, which relies on a variational inference method, results in high computational complexity.

Computational Efficiency • Speech Enhancement +1
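
One way to sidestep the VEM iterations is to draw posterior samples of the latent variables directly, for example with unadjusted Langevin dynamics on log p(x | z) + log p(z). The sketch below illustrates that generic idea only; the `decoder` and `noise_var` interfaces are hypothetical, and this is not necessarily the algorithm proposed in the paper.

```python
import torch

def langevin_posterior_sampling(x, decoder, noise_var, z_init, n_steps=50, step=1e-2):
    """Illustrative Langevin sampler for p(z | x) proportional to p(x | z) p(z).
    x: noisy STFT frames, decoder(z): speech variance per TF bin, noise_var: noise variance."""
    z = z_init.clone().requires_grad_(True)
    for _ in range(n_steps):
        sigma2_s = decoder(z)                        # speech variance from the VAE decoder
        var = sigma2_s + noise_var                   # total variance of the noisy mixture
        log_lik = -(torch.log(var) + x.abs() ** 2 / var).sum()
        log_prior = -0.5 * (z ** 2).sum()            # standard normal prior on z
        grad, = torch.autograd.grad(log_lik + log_prior, z)
        with torch.no_grad():                        # Langevin update: gradient step + noise
            z = z + 0.5 * step * grad + (step ** 0.5) * torch.randn_like(z)
        z.requires_grad_(True)
    return z.detach()
```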

End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis

no code implementations • 16 Oct 2023 • Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

We present an end-to-end multichannel speaker-attributed automatic speech recognition (MC-SA-ASR) system that combines a Conformer-based encoder with multi-frame cross-channel attention and a speaker-attributed Transformer-based decoder.

Automatic Speech Recognition • Speaker Identification +2
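
As a rough illustration of multi-frame cross-channel attention, the sketch below lets each frame attend over a small temporal window of features from all microphone channels and returns fused single-channel features. The dimensions, windowing, and channel-averaged query are assumptions of this sketch, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiFrameCrossChannelAttention(nn.Module):
    """Each frame attends over a +/- context window of frames from all channels (sketch)."""
    def __init__(self, d_model=256, n_heads=4, context=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.context = context                                  # frames of temporal context

    def forward(self, x):
        # x: (batch, channels, time, d_model)
        B, C, T, D = x.shape
        p = self.context
        xp = F.pad(x, (0, 0, p, p))                             # pad the time axis
        # Build, for every frame, a window of (2*context+1) frames from every channel
        windows = torch.stack([xp[:, :, t:t + 2 * p + 1] for t in range(T)], dim=2)
        kv = windows.permute(0, 2, 1, 3, 4).reshape(B * T, C * (2 * p + 1), D)
        q = x.mean(dim=1).reshape(B * T, 1, D)                  # channel-averaged query per frame
        out, _ = self.attn(q, kv, kv)
        return out.reshape(B, T, D)                             # fused single-channel features
```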

Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications

no code implementations • 11 Mar 2024 • Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

Past studies on end-to-end meeting transcription have focused on model architecture and have mostly been evaluated on simulated meeting data.

Action Detection • Activity Detection +2
