no code implementations • 14 Jun 2023 • Sanjana Sankar, Denis Beautemps, Frédéric Elisei, Olivier Perrotin, Thomas Hueber
Along with the release of this dataset, a benchmark will be reported for word-level recognition, a novelty in the automatic recognition of French Cued Speech (CS).
no code implementations • 4 Jul 2022 • Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber
We collect a corpus of utterances containing contrastive focus and we evaluate the accuracy of a BERT model, finetuned to predict quantized acoustic prominence features, on these samples.
no code implementations • 17 Jun 2022 • Marc-Antoine Georges, Jean-Luc Schwartz, Thomas Hueber
The human perception system is often assumed to recruit motor knowledge when processing auditory speech inputs.
no code implementations • 11 Apr 2022 • Sanjana Sankar, Denis Beautemps, Thomas Hueber
This paper proposes a simple and effective approach for automatic recognition of Cued Speech (CS), a visual communication tool that helps people with hearing impairment understand spoken language through hand gestures that uniquely identify the uttered phonemes in complement to lipreading.
no code implementations • 5 Apr 2022 • Marc-Antoine Georges, Julien Diard, Laurent Girin, Jean-Luc Schwartz, Thomas Hueber
We propose a computational model of speech production combining three components: (1) a pre-trained neural articulatory synthesizer able to reproduce complex speech stimuli from a limited set of interpretable articulatory parameters; (2) a DNN-based internal forward model predicting the sensory consequences of articulatory commands; and (3) an internal inverse model, based on a recurrent neural network, recovering articulatory commands from the acoustic speech input.
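The three-component loop described in this abstract (synthesizer, internal forward model, internal inverse model) can be illustrated with a toy sketch. This is not the paper's DNN/RNN architecture: here all three components are linear stand-ins, and the forward model is deliberately identical to the synthesizer (an "ideal" internal model), which is an assumption made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in articulatory-to-acoustic mapping (the neural synthesizer in the paper).
A = rng.normal(size=(4, 3))  # 3 articulatory parameters -> 4 acoustic features

def synthesizer(articulatory):
    """Produce acoustics from articulatory parameters."""
    return A @ articulatory

def forward_model(commands):
    """Internal prediction of the sensory consequences of articulatory commands.
    Here it is an ideal copy of the synthesizer, purely for illustration."""
    return A @ commands

def inverse_model(acoustics):
    """Recover articulatory commands from acoustics (least-squares stand-in
    for the paper's recurrent-network inverse model)."""
    return np.linalg.lstsq(A, acoustics, rcond=None)[0]

commands = rng.normal(size=3)
speech = synthesizer(commands)
recovered = inverse_model(speech)
predicted = forward_model(recovered)
# Feeding the inverse model's output through the forward model should
# approximately reproduce the observed acoustics.
```

In this linear toy the round trip is exact; the paper's interest lies precisely in how well learned neural forward/inverse models approximate this loop.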
no code implementations • 7 Apr 2021 • Marc-Antoine Georges, Laurent Girin, Jean-Luc Schwartz, Thomas Hueber
It is increasingly considered that human speech perception and production both rely on articulatory representations.
no code implementations • 19 Feb 2021 • Brooke Stephenson, Thomas Hueber, Laurent Girin, Laurent Besacier
The prosody of a spoken word is determined by its surrounding context.
no code implementations • 4 Sep 2020 • Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber
In this paper, we study the behavior of a neural sequence-to-sequence TTS system when used in an incremental mode, i.e., when generating speech output for token n, the system has access to n + k tokens from the text sequence.
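The incremental mode described above can be sketched as a sliding lookahead window: for token n, the model only sees the prefix plus k tokens of lookahead, not the full text. This is a hypothetical sketch, with `synthesize_token` as a stand-in for a call into a neural seq2seq TTS model.

```python
def incremental_tts(tokens, k, synthesize_token):
    """Generate output for each token n using only tokens[0 : n + 1 + k]."""
    audio_chunks = []
    for n in range(len(tokens)):
        # The model's context is limited to k tokens of lookahead.
        visible = tokens[: n + 1 + k]
        audio_chunks.append(synthesize_token(visible, n))
    return audio_chunks

# Toy stand-in: "synthesis" just records how much context was visible.
chunks = incremental_tts(["the", "cat", "sat"], k=1,
                         synthesize_token=lambda ctx, n: (n, len(ctx)))
# -> [(0, 2), (1, 3), (2, 3)]: token 0 was generated with 2 visible tokens, etc.
```

Small k reduces latency at the cost of context; the paper studies exactly this quality/lookahead trade-off.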
1 code implementation • 28 Aug 2020 • Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda
Recently, a series of papers has presented different extensions of the variational autoencoder (VAE) to process sequential data, modeling not only the latent space but also the temporal dependencies within a sequence of data vectors and the corresponding latent vectors, relying on recurrent neural networks or state-space models.
no code implementations • 11 Jun 2018 • Fanny Roche, Thomas Hueber, Samuel Limier, Laurent Girin
This study investigates the use of non-linear unsupervised dimensionality reduction techniques to compress a music dataset into a low-dimensional representation which can be used in turn for the synthesis of new sounds.
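The compress-then-synthesize idea above can be sketched with a tiny non-linear autoencoder. Everything here is a toy assumption: the "spectral frames" are synthetic data lying near a 2-D subspace, and the network is a single tanh bottleneck trained by plain gradient descent, standing in for the paper's actual dimensionality-reduction techniques.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "spectral frames": 200 samples of 16 bins near a 2-D subspace,
# standing in for analysis frames of a music dataset.
latent_true = rng.normal(size=(200, 2))
X = latent_true @ rng.normal(size=(2, 16)) + 0.01 * rng.normal(size=(200, 16))

# Minimal autoencoder: 16 -> 2 (tanh bottleneck) -> 16, trained by gradient descent.
enc = rng.normal(scale=0.1, size=(16, 2))
dec = rng.normal(scale=0.1, size=(2, 16))
lr = 0.05
for _ in range(500):
    Z = np.tanh(X @ enc)      # encode to the low-dimensional representation
    X_hat = Z @ dec           # decode, i.e. "synthesize" frames back
    err = X_hat - X
    # Backpropagate through the decoder and the tanh encoder.
    g_dec = Z.T @ err / len(X)
    g_Z = err @ dec.T * (1 - Z ** 2)
    g_enc = X.T @ g_Z / len(X)
    dec -= lr * g_dec
    enc -= lr * g_enc

mse = float(np.mean((np.tanh(X @ enc) @ dec - X) ** 2))
# New sounds could then be synthesized by decoding novel points in the 2-D space.
```

After training, reconstruction error is well below the variance of the data, and the 2-D bottleneck is the low-dimensional representation the abstract refers to.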
Audio and Speech Processing • Sound
no code implementations • JEPTALNRECITAL 2012 • Thomas Hueber, Atef Ben-Youssef, Pierre Badin, Gérard Bailly, Frédéric Eliséi