Search Results for author: Simon Leglaive

Found 18 papers, 4 papers with code

VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space

no code implementations • 13 Dec 2023 • Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda, Antonio Agudo, Francesc Moreno-Noguer

Instead of predicting body model parameters or 3D vertex coordinates, we focus on predicting the proposed discrete latent representation, which can be decoded into a registered human mesh.

Unsupervised speech enhancement with deep dynamical generative speech and noise models

no code implementations • 13 Jun 2023 • Xiaoyu Lin, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda

This work builds on previous work on unsupervised speech enhancement that uses a dynamical variational autoencoder (DVAE) as the clean speech model and non-negative matrix factorization (NMF) as the noise model.

Speech Enhancement

A vector quantized masked autoencoder for audiovisual speech emotion recognition

no code implementations • 5 May 2023 • Samir Sadok, Simon Leglaive, Renaud Séguier

While fully-supervised models have been shown to be effective for audiovisual speech emotion recognition (SER), the limited availability of labeled data remains a major challenge in the field.

Representation Learning, Self-Supervised Learning, +1

A multimodal dynamical variational autoencoder for audiovisual speech representation learning

no code implementations • 5 May 2023 • Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier

The latent space is structured to dissociate the latent dynamical factors that are shared between the modalities from those that are specific to each modality.

Disentanglement, Image Denoising, +2

A vector quantized masked autoencoder for speech emotion recognition

no code implementations • 21 Apr 2023 • Samir Sadok, Simon Leglaive, Renaud Séguier

The VQ-MAE-S model is based on a masked autoencoder (MAE) that operates in the discrete latent space of a vector-quantized variational autoencoder.

Self-Supervised Learning, Speech Emotion Recognition

Speech Modeling with a Hierarchical Transformer Dynamical VAE

no code implementations • 7 Mar 2023 • Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda

The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extends the VAE to model a sequence of observed data and a corresponding sequence of latent vectors.

Speech Enhancement

Learning and controlling the source-filter representation of speech with a variational autoencoder

1 code implementation • 14 Apr 2022 • Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier

Using only a few seconds of labeled speech signals generated with an artificial speech synthesizer, we propose a method to identify the latent subspaces encoding $f_0$ and the first three formant frequencies. We show that these subspaces are orthogonal and, based on this orthogonality, develop a method to accurately and independently control the source-filter speech factors within the latent subspaces.

Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders

1 code implementation • 23 Jun 2021 • Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin

We propose an unsupervised speech enhancement algorithm that combines a DVAE speech prior pre-trained on clean speech signals with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement.

Representation Learning, Speech Enhancement, +2

Dynamical Variational Autoencoders: A Comprehensive Review

1 code implementation • 28 Aug 2020 • Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda

Recently, a series of papers have presented different extensions of the VAE to process sequential data, which model not only the latent space but also the temporal dependencies within a sequence of data vectors and corresponding latent vectors, relying on recurrent neural networks or state-space models.

3D Human Dynamics, Resynthesis, +2

A Recurrent Variational Autoencoder for Speech Enhancement

no code implementations • 24 Oct 2019 • Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud

This paper presents a generative approach to speech enhancement based on a recurrent variational autoencoder (RVAE).

Speech Enhancement

Audio-visual Speech Enhancement Using Conditional Variational Auto-Encoders

no code implementations • 7 Aug 2019 • Mostafa Sadeghi, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud

Variational auto-encoders (VAEs) are deep generative latent variable models that can be used for learning the distribution of complex data.

Speech Enhancement

A variance modeling framework based on variational autoencoders for speech enhancement

1 code implementation • 5 Feb 2019 • Simon Leglaive, Laurent Girin, Radu Horaud

In this paper, we address the problem of enhancing speech signals in noisy mixtures using a source separation approach.

Speech Enhancement
