Search Results for author: Thierry Dutoit

Found 44 papers, 10 papers with code

A study of parameters affecting visual saliency assessment

no code implementations • 22 Jul 2013 • Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit

In this paper, a new framework is proposed to assess models of visual saliency.

The AV-LASYN Database : A synchronous corpus of audio and 3D facial marker data for audio-visual laughter synthesis

no code implementations • LREC 2014 • Hüseyin Çakmak, Jérôme Urbain, Thierry Dutoit, Joëlle Tilmanne

Since the aim is to use this database for HMM-based modeling and synthesis, the amount of collected data from one given subject had to be maximized.

Dimensionality Reduction Speech Synthesis

AVAB-DBS: an Audio-Visual Affect Bursts Database for Synthesis

no code implementations • LREC 2016 • Kevin El Haddad, Hüseyin Çakmak, Stéphane Dupont, Thierry Dutoit

It has been shown that adding expressivity and emotional expressions to an agent's communication systems would improve the interaction quality between this agent and a human user.

ASR-based Features for Emotion Recognition: A Transfer Learning Approach

no code implementations • WS 2018 • Noé Tits, Kevin El Haddad, Thierry Dutoit

During the last decade, the applications of signal processing have drastically improved with deep learning.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems

1 code implementation • 25 Jun 2018 • Adaeze Adigwe, Noé Tits, Kevin El Haddad, Sarah Ostadabbas, Thierry Dutoit

In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purpose.

Speech Emotion Recognition Speech Synthesis +1

214

Exploring Transfer Learning for Low Resource Emotional TTS

5 code implementations • Advances in Intelligent Systems and Computing 2019 • Noé Tits, Kevin El Haddad, Thierry Dutoit

During the last few years, spoken language technologies have known a big improvement thanks to Deep Learning.

Emotional Speech Synthesis Expressive Speech Synthesis +2

374

Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis

1 code implementation • 27 Mar 2019 • Noé Tits, Fengna Wang, Kevin El Haddad, Vincent Pagel, Thierry Dutoit

The field of Text-to-Speech has experienced huge improvements in recent years, benefiting from deep learning techniques.

Emotional Speech Synthesis Expressive Speech Synthesis +3

The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach

no code implementations • 14 Oct 2019 • Noé Tits, Kevin El Haddad, Thierry Dutoit

Finally, we focus on the last one, with the latest techniques modeling Text-to-Speech synthesis as a sequence-to-sequence problem.

Expressive Speech Synthesis Sociology +1

Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

no code implementations • 28 Dec 2019 • Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, Thierry Dutoit

The five techniques compared are the Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) and the Yet Another GCI Algorithm (YAGA).

Event Detection

A Comparative Study of Glottal Source Estimation Techniques

no code implementations • 28 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit

Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals.

Glottal Closure and Opening Instant Detection from Speech Signals

no code implementations • 28 Dec 2019 • Thomas Drugman, Thierry Dutoit

This paper proposes a new procedure to detect Glottal Closure and Opening Instants (GCIs and GOIs) directly from speech waveforms.

Position

A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis

no code implementations • 29 Dec 2019 • Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit

For this, we hereby propose an adaptation of the Deterministic plus Stochastic Model (DSM) for the residual.

Speech Synthesis

Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation

no code implementations • 29 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit

Homomorphic analysis is a well-known method for the separation of non-linearly combined signals.

The Deterministic plus Stochastic Model of the Residual Signal and its Applications

no code implementations • 29 Dec 2019 • Thomas Drugman, Thierry Dutoit

The applicability of the DSM in two fields of speech processing is then studied.

Speaker Identification Speech Synthesis

Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation

no code implementations • 30 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit

Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met.

Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis

no code implementations • 30 Dec 2019 • Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart

The source signal is obtained by concatenating excitation frames picked up from the codebook, based on a selection criterion and taking target residual coefficients as input.

Speech Synthesis

A Comparative Evaluation of Pitch Modification Techniques

no code implementations • 2 Jan 2020 • Thomas Drugman, Thierry Dutoit

This paper addresses the problem of pitch modification, as an important module for an efficient voice transformation system.

Eigenresiduals for improved Parametric Speech Synthesis

no code implementations • 2 Jan 2020 • Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit

Statistical parametric speech synthesizers have recently shown their ability to produce natural-sounding and flexible voices.

Speech Synthesis

On the Mutual Information between Source and Filter Contributions for Voice Pathology Detection

no code implementations • 2 Jan 2020 • Thomas Drugman, Thomas Dubuisson, Thierry Dutoit

This paper addresses the problem of automatic detection of voice pathologies directly from the speech signal.

Excitation-based Voice Quality Analysis and Modification

no code implementations • 2 Jan 2020 • Thomas Drugman, Thierry Dutoit, Baris Bozkurt

This paper investigates the differences occurring in the excitation for different voice qualities.

Speech Synthesis

Phase-based Information for Voice Pathology Detection

no code implementations • 2 Jan 2020 • Thomas Drugman, Thomas Dubuisson, Thierry Dutoit

In most current approaches of speech processing, information is extracted from the magnitude spectrum.

Chirp Complex Cepstrum-based Decomposition for Asynchronous Glottal Analysis

no code implementations • 10 May 2020 • Thomas Drugman, Thierry Dutoit

It was recently shown that complex cepstrum can be effectively used for glottal flow estimation by separating the causal and anticausal components of speech.

Oscillating Statistical Moments for Speech Polarity Detection

no code implementations • 16 May 2020 • Thomas Drugman, Thierry Dutoit

An inversion of the speech polarity may have a dramatic detrimental effect on the performance of various techniques of speech processing.

Glottal Source Estimation using an Automatic Chirp Decomposition

no code implementations • 16 May 2020 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit

In a previous work, we showed that the glottal source can be estimated from speech signals by computing the Zeros of the Z-Transform (ZZT).

Glottal source estimation robustness: A comparison of sensitivity of voice source estimation techniques

no code implementations • 24 May 2020 • Thomas Drugman, Thomas Dubuisson, Alexis Moinet, Nicolas D'Alessandro, Thierry Dutoit

This paper addresses the problem of estimating the voice source directly from speech waveforms.

Parametric Representation for Singing Voice Synthesis: a Comparative Evaluation

no code implementations • 7 Jun 2020 • Onur Babacan, Thomas Drugman, Tuomo Raitio, Daniel Erro, Thierry Dutoit

Various parametric representations have been proposed to model the speech signal.

Singing Voice Synthesis

Analysis and Synthesis of Hypo and Hyperarticulated Speech

no code implementations • 7 Jun 2020 • Benjamin Picart, Thomas Drugman, Thierry Dutoit

This paper focuses on the analysis and synthesis of hypo and hyperarticulated speech in the framework of HMM-based speech synthesis.

Speech Synthesis

Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning

1 code implementation • 20 Aug 2020 • Noé Tits, Kevin El Haddad, Thierry Dutoit

Despite the growing interest for expressive speech synthesis, synthesis of nonverbal expressions is an under-explored area.

Expressive Speech Synthesis Transfer Learning

ICE-Talk: an Interface for a Controllable Expressive Talking Machine

1 code implementation • 25 Aug 2020 • Noé Tits, Kevin El Haddad, Thierry Dutoit

ICE-Talk is an open source web-based GUI that allows the use of a TTS system with controllable parameters via a text field and a clickable 2D plot.

Analysis and Assessment of Controllability of an Expressive Deep Learning-based TTS system

1 code implementation • 6 Mar 2021 • Noé Tits, Kevin El Haddad, Thierry Dutoit

In this paper, we study the controllability of an Expressive TTS system trained on a dataset for a continuous control.

Continuous Control

Where Is My Mind (looking at)? Predicting Visual Attention from Brain Activity

no code implementations • 11 Jan 2022 • Victor Delvigne, Noé Tits, Luca La Fisca, Nathan Hubens, Antoine Maiorca, Hazem Wannous, Thierry Dutoit, Jean-Philippe Vandeborre

The codes and dataset considered in this paper have been made available at https://figshare.com/s/3e353bd1c621962888ad to promote research in the field.

EEG

A Saliency based Feature Fusion Model for EEG Emotion Estimation

1 code implementation • 11 Jan 2022 • Victor Delvigne, Antoine Facchini, Hazem Wannous, Thierry Dutoit, Laurence Ris, Jean-Philippe Vandeborre

Among the different modalities to assess emotion, electroencephalogram (EEG), representing the electrical brain activity, achieved motivating results over the last decade.

EEG

Towards Lightweight Neural Animation : Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models

no code implementations • 11 Jan 2022 • Antoine Maiorca, Nathan Hubens, Sohaib Laraba, Thierry Dutoit

Their motion is synthesized by a neural network.

Network Pruning

Spatio-Temporal Analysis of Transformer based Architecture for Attention Estimation from EEG

no code implementations • 4 Apr 2022 • Victor Delvigne, Hazem Wannous, Jean-Philippe Vandeborre, Laurence Ris, Thierry Dutoit

For many years now, understanding the brain mechanism has been a great research subject in many different fields.

EEG Machine Translation

Evaluating the Quality of a Synthesized Motion with the Fréchet Motion Distance

1 code implementation • 26 Apr 2022 • Antoine Maiorca, Youngwoo Yoon, Thierry Dutoit

Evaluating the Quality of a Synthesized Motion with the Fréchet Motion Distance

Analysis of Co-Laughter Gesture Relationship on RGB videos in Dyadic Conversation Context

no code implementations • 20 May 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Laughter is not just an audio signal, but an intrinsic relationship of multimodal non-verbal communication; in addition to audio, it includes facial expressions and body movements.

Motion Synthesis

Transformers and CNNs both Beat Humans on SBIR

no code implementations • 14 Sep 2022 • Omar Seddati, Stéphane Dupont, Saïd Mahmoudi, Thierry Dutoit

Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and the spatial configuration of hand-drawn sketch queries.

Retrieval Sketch-Based Image Retrieval

Cardiotocography Signal Abnormality Detection based on Deep Unsupervised Models

no code implementations • 29 Sep 2022 • Julien Bertieaux, Mohammadhadi Shateri, Fabrice Labeau, Thierry Dutoit

The GANomaly framework, modified to capture the underlying distribution of data samples, is used as our main model and is applied to the CTU-UHB dataset.

Anomaly Detection

Synthesizer Preset Interpolation using Transformer Auto-Encoders

1 code implementation • 27 Oct 2022 • Gwendal Le Vaillant, Thierry Dutoit

Sound synthesizers are widespread in modern music production but they increasingly require expert skills to be mastered.

Deep learning-based stereo camera multi-video synchronization

1 code implementation • 22 Mar 2023 • Nicolas Boizard, Kevin El Haddad, Thierry Ravet, François Cresson, Thierry Dutoit

Stereo vision is essential for many applications.

Video Synchronization

A Recipe for Efficient SBIR Models: Combining Relative Triplet Loss with Batch Normalization and Knowledge Distillation

no code implementations • 30 May 2023 • Omar Seddati, Nathan Hubens, Stéphane Dupont, Thierry Dutoit

Then, we introduce a Relative Triplet Loss (RTL), an adapted triplet loss to overcome those limitations through loss weighting based on anchors similarity.

Data Augmentation Knowledge Distillation +2

A New Perspective on Smiling and Laughter Detection: Intensity Levels Matter

no code implementations • 4 Mar 2024 • Hugo Bohy, Kevin El Haddad, Thierry Dutoit

We also present an in-depth analysis of the behavior of these models on the smiles and laughs intensity levels.

Transfer Learning

Analysis of Co-Laughter Gesture Relationship on RGB Videos in Dyadic Conversation Context

no code implementations • SmiLa (LREC) 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Laughter is not just an audio signal, but an intrinsic relationship of multimodal non-verbal communication; in addition to audio, it includes facial expressions and body movements.

Motion Synthesis

Are There Any Body-movement Differences between Women and Men When They Laugh?

no code implementations • SmiLa (LREC) 2022 • Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit

Women smile more than men, although women are not universally more expressive across all facial actions.

Pose Estimation
