no code implementations • 22 Jul 2013 • Nicolas Riche, Matthieu Duvinage, Matei Mancas, Bernard Gosselin, Thierry Dutoit
In this paper, a new framework is proposed to assess models of visual saliency.
no code implementations • LREC 2014 • Hüseyin Çakmak, Jérôme Urbain, Thierry Dutoit, Joëlle Tilmanne
Since the aim is to use this database for HMM-based modeling and synthesis, the amount of collected data from one given subject had to be maximized.
no code implementations • LREC 2016 • Kevin El Haddad, Hüseyin Çakmak, Stéphane Dupont, Thierry Dutoit
It has been shown that adding expressivity and emotional expressions to an agent's communication systems would improve the interaction quality between this agent and a human user.
no code implementations • WS 2018 • Noé Tits, Kevin El Haddad, Thierry Dutoit
During the last decade, the applications of signal processing have drastically improved with deep learning.
1 code implementation • 25 Jun 2018 • Adaeze Adigwe, Noé Tits, Kevin El Haddad, Sarah Ostadabbas, Thierry Dutoit
In this paper, we present a database of emotional speech intended to be open-sourced and used for synthesis and generation purposes.
5 code implementations • Advances in Intelligent Systems and Computing 2019 • Noé Tits, Kevin El Haddad, Thierry Dutoit
During the last few years, spoken language technologies have improved considerably thanks to deep learning.
1 code implementation • 27 Mar 2019 • Noé Tits, Fengna Wang, Kevin El Haddad, Vincent Pagel, Thierry Dutoit
The field of Text-to-Speech has improved greatly in recent years thanks to deep learning techniques.
no code implementations • 14 Oct 2019 • Noé Tits, Kevin El Haddad, Thierry Dutoit
Finally, we focus on the last one, with the latest techniques modeling Text-to-Speech synthesis as a sequence-to-sequence problem.
no code implementations • 28 Dec 2019 • Thomas Drugman, Thierry Dutoit
This paper proposes a new procedure to detect Glottal Closure and Opening Instants (GCIs and GOIs) directly from speech waveforms.
no code implementations • 28 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit
Techniques based on the mixed-phase decomposition and on a closed-phase inverse filtering process turn out to give the best results on both clean synthetic and real speech signals.
no code implementations • 28 Dec 2019 • Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, Thierry Dutoit
The five techniques compared are the Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), the Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS) and the Yet Another GCI Algorithm (YAGA).
no code implementations • 29 Dec 2019 • Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit
For this, we hereby propose an adaptation of the Deterministic plus Stochastic Model (DSM) for the residual.
no code implementations • 29 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit
Homomorphic analysis is a well-known method for the separation of non-linearly combined signals.
no code implementations • 29 Dec 2019 • Thomas Drugman, Thierry Dutoit
The applicability of the DSM in two fields of speech processing is then studied.
no code implementations • 30 Dec 2019 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit
Via a systematic study of the windowing effects on the deconvolution quality, we show that the complex cepstrum causal-anticausal decomposition can be effectively used for glottal flow estimation when specific windowing criteria are met.
no code implementations • 30 Dec 2019 • Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart
The source signal is obtained by concatenating excitation frames picked up from the codebook, based on a selection criterion and taking target residual coefficients as input.
no code implementations • 2 Jan 2020 • Thomas Drugman, Thomas Dubuisson, Thierry Dutoit
This paper addresses the problem of automatic detection of voice pathologies directly from the speech signal.
no code implementations • 2 Jan 2020 • Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit
Statistical parametric speech synthesizers have recently shown their ability to produce natural-sounding and flexible voices.
no code implementations • 2 Jan 2020 • Thomas Drugman, Thierry Dutoit
This paper addresses the problem of pitch modification, as an important module for an efficient voice transformation system.
no code implementations • 2 Jan 2020 • Thomas Drugman, Thierry Dutoit, Baris Bozkurt
This paper investigates the differences occurring in the excitation for different voice qualities.
no code implementations • 2 Jan 2020 • Thomas Drugman, Thomas Dubuisson, Thierry Dutoit
In most current approaches to speech processing, information is extracted from the magnitude spectrum.
no code implementations • 10 May 2020 • Thomas Drugman, Thierry Dutoit
It was recently shown that complex cepstrum can be effectively used for glottal flow estimation by separating the causal and anticausal components of speech.
no code implementations • 16 May 2020 • Thomas Drugman, Baris Bozkurt, Thierry Dutoit
In a previous work, we showed that the glottal source can be estimated from speech signals by computing the Zeros of the Z-Transform (ZZT).
no code implementations • 16 May 2020 • Thomas Drugman, Thierry Dutoit
An inversion of the speech polarity may have a dramatic detrimental effect on the performance of various techniques of speech processing.
no code implementations • 24 May 2020 • Thomas Drugman, Thomas Dubuisson, Alexis Moinet, Nicolas D'Alessandro, Thierry Dutoit
This paper addresses the problem of estimating the voice source directly from speech waveforms.
no code implementations • 7 Jun 2020 • Benjamin Picart, Thomas Drugman, Thierry Dutoit
This paper focuses on the analysis and synthesis of hypo and hyperarticulated speech in the framework of HMM-based speech synthesis.
no code implementations • 7 Jun 2020 • Onur Babacan, Thomas Drugman, Tuomo Raitio, Daniel Erro, Thierry Dutoit
Various parametric representations have been proposed to model the speech signal.
1 code implementation • 20 Aug 2020 • Noé Tits, Kevin El Haddad, Thierry Dutoit
Despite the growing interest in expressive speech synthesis, synthesis of nonverbal expressions is an under-explored area.
1 code implementation • 25 Aug 2020 • Noé Tits, Kevin El Haddad, Thierry Dutoit
ICE-Talk is an open source web-based GUI that allows the use of a TTS system with controllable parameters via a text field and a clickable 2D plot.
1 code implementation • 6 Mar 2021 • Noé Tits, Kevin El Haddad, Thierry Dutoit
In this paper, we study the controllability of an Expressive TTS system trained on a dataset for a continuous control.
no code implementations • 11 Jan 2022 • Victor Delvigne, Noé Tits, Luca La Fisca, Nathan Hubens, Antoine Maiorca, Hazem Wannous, Thierry Dutoit, Jean-Philippe Vandeborre
The code and dataset considered in this paper have been made available at https://figshare.com/s/3e353bd1c621962888ad to promote research in the field.
1 code implementation • 11 Jan 2022 • Victor Delvigne, Antoine Facchini, Hazem Wannous, Thierry Dutoit, Laurence Ris, Jean-Philippe Vandeborre
Among the different modalities used to assess emotion, the electroencephalogram (EEG), which represents the electrical activity of the brain, has achieved encouraging results over the last decade.
no code implementations • 11 Jan 2022 • Antoine Maiorca, Nathan Hubens, Sohaib Laraba, Thierry Dutoit
Their motion is synthesized by a neural network.
no code implementations • 4 Apr 2022 • Victor Delvigne, Hazem Wannous, Jean-Philippe Vandeborre, Laurence Ris, Thierry Dutoit
For many years now, understanding the brain mechanism has been a great research subject in many different fields.
1 code implementation • 26 Apr 2022 • Antoine Maiorca, Youngwoo Yoon, Thierry Dutoit
Evaluating the Quality of a Synthesized Motion with the Fréchet Motion Distance
no code implementations • 20 May 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Laughter is not just an audio signal but an intrinsically multimodal form of non-verbal communication: in addition to audio, it includes facial expressions and body movements.
no code implementations • 14 Sep 2022 • Omar Seddati, Stéphane Dupont, Saïd Mahmoudi, Thierry Dutoit
Sketch-based image retrieval (SBIR) is the task of retrieving natural images (photos) that match the semantics and the spatial configuration of hand-drawn sketch queries.
no code implementations • 29 Sep 2022 • Julien Bertieaux, Mohammadhadi Shateri, Fabrice Labeau, Thierry Dutoit
The GANomaly framework, modified to capture the underlying distribution of data samples, is used as our main model and is applied to the CTU-UHB dataset.
1 code implementation • 27 Oct 2022 • Gwendal Le Vaillant, Thierry Dutoit
Sound synthesizers are widespread in modern music production but they increasingly require expert skills to be mastered.
1 code implementation • 22 Mar 2023 • Nicolas Boizard, Kevin El Haddad, Thierry Ravet, François Cresson, Thierry Dutoit
Stereo vision is essential for many applications.
no code implementations • 30 May 2023 • Omar Seddati, Nathan Hubens, Stéphane Dupont, Thierry Dutoit
Then, we introduce a Relative Triplet Loss (RTL), an adapted triplet loss that overcomes those limitations through loss weighting based on anchor similarity.
no code implementations • 4 Mar 2024 • Hugo Bohy, Kevin El Haddad, Thierry Dutoit
We also present an in-depth analysis of the behavior of these models across smile and laugh intensity levels.
no code implementations • 29 Apr 2024 • Antoine Maiorca, Seyed Abolfazl Ghasemzadeh, Thierry Ravet, François Cresson, Thierry Dutoit, Christophe De Vleeschouwer
Nevertheless, these systems have multiple limitations: desynchronization between the two motion sources and occlusions are examples of significant issues that hinder their implementation.
no code implementations • SmiLa (LREC) 2022 • Hugo Bohy, Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Laughter is not just an audio signal but an intrinsically multimodal form of non-verbal communication: in addition to audio, it includes facial expressions and body movements.
no code implementations • SmiLa (LREC) 2022 • Ahmad Hammoudeh, Antoine Maiorca, Stéphane Dupont, Thierry Dutoit
Women smile more than men, although women are not universally more expressive across all facial actions.