Search Results for author: Simon Leglaive

Found 18 papers, 4 papers with code

VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space

no code implementations | 13 Dec 2023 | Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno-Noguer

Instead of predicting body model parameters or 3D vertex coordinates, our focus is on forecasting the proposed discrete latent representation, which can be decoded into a registered human mesh.

Unsupervised speech enhancement with deep dynamical generative speech and noise models

no code implementations | 13 Jun 2023 | Xiaoyu Lin, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda

This work builds on a previous work on unsupervised speech enhancement using a dynamical variational autoencoder (DVAE) as the clean speech model and non-negative matrix factorization (NMF) as the noise model.

Speech Enhancement

A multimodal dynamical variational autoencoder for audiovisual speech representation learning

no code implementations | 5 May 2023 | Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier

The latent space is structured to dissociate the latent dynamical factors that are shared between the modalities from those that are specific to each modality.

Disentanglement, Image Denoising, +2

A vector quantized masked autoencoder for audiovisual speech emotion recognition

no code implementations | 5 May 2023 | Samir Sadok, Simon Leglaive, Renaud Séguier

While fully-supervised models have been shown to be effective for audiovisual speech emotion recognition (SER), the limited availability of labeled data remains a major challenge in the field.

Representation Learning, Self-Supervised Learning, +1

A vector quantized masked autoencoder for speech emotion recognition

no code implementations | 21 Apr 2023 | Samir Sadok, Simon Leglaive, Renaud Séguier

The VQ-MAE-S model is based on a masked autoencoder (MAE) that operates in the discrete latent space of a vector-quantized variational autoencoder.

Self-Supervised Learning, Speech Emotion Recognition

LatentForensics: Towards frugal deepfake detection in the StyleGAN latent space

no code implementations | 30 Mar 2023 | Matthieu Delmas, Amine Kacete, Stephane Paquelet, Simon Leglaive, Renaud Seguier

Combined with other recent studies on the interpretation and manipulation of this latent space, we believe that the proposed approach can further help in developing frugal deepfake classification methods based on interpretable high-level properties of face images.

Binary Classification, Classification, +3

Speech Modeling with a Hierarchical Transformer Dynamical VAE

no code implementations | 7 Mar 2023 | Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda

The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extends the VAE to model a sequence of observed data and a corresponding sequence of latent vectors.

Speech Enhancement

Learning and controlling the source-filter representation of speech with a variational autoencoder

1 code implementation | 14 Apr 2022 | Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier

Using only a few seconds of labeled speech signals generated with an artificial speech synthesizer, we propose a method to identify the latent subspaces encoding $f_0$ and the first three formant frequencies, we show that these subspaces are orthogonal, and based on this orthogonality, we develop a method to accurately and independently control the source-filter speech factors within the latent subspaces.

Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders

1 code implementation | 23 Jun 2021 | Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin

We propose an unsupervised speech enhancement algorithm that combines a DVAE speech prior pre-trained on clean speech signals with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement.

Representation Learning, Speech Enhancement, +2

Dynamical Variational Autoencoders: A Comprehensive Review

1 code implementation | 28 Aug 2020 | Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda

Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models.

3D Human Dynamics, Resynthesis, +2

A Recurrent Variational Autoencoder for Speech Enhancement

no code implementations | 24 Oct 2019 | Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE).

Speech Enhancement

Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders

no code implementations | 7 Aug 2019 | Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud

Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data.

Speech Enhancement

A variance modeling framework based on variational autoencoders for speech enhancement

1 code implementation | 5 Feb 2019 | Simon Leglaive, Laurent Girin, Radu Horaud

In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach.

Speech Enhancement
