Search Results for author: Thomas Hueber

Found 11 papers, 1 paper with code

Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding

no code implementations • 14 Jun 2023 • Sanjana Sankar, Denis Beautemps, Frédéric Elisei, Olivier Perrotin, Thomas Hueber

Along with the release of this dataset, a benchmark will be reported for word-level recognition, a novelty in the automatic recognition of French CS.

Lipreading

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

no code implementations • 4 Jul 2022 • Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

We collect a corpus of utterances containing contrastive focus, and we evaluate on these samples the accuracy of a BERT model finetuned to predict quantized acoustic prominence features.

Language Modelling • Speech Synthesis +1
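The entry above describes finetuning BERT to predict *quantized* acoustic prominence features. As an illustration only (the paper's actual quantization scheme is not given here), continuous prominence scores can be binned into discrete class labels like this:

```python
import numpy as np

def quantize_prominence(values, n_bins=4):
    """Map continuous prominence scores to discrete class labels.

    Bin edges are uniform over the observed range -- an assumption
    for illustration, not necessarily the paper's scheme.
    """
    values = np.asarray(values, dtype=float)
    edges = np.linspace(values.min(), values.max(), n_bins + 1)[1:-1]
    return np.digitize(values, edges)

labels = quantize_prominence([0.1, 0.9, 0.4, 0.75])
```

The resulting integer labels can then serve as classification targets for a finetuned language model.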

Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE

no code implementations • 17 Jun 2022 • Marc-Antoine Georges, Jean-Luc Schwartz, Thomas Hueber

The human perception system is often assumed to recruit motor knowledge when processing auditory speech inputs.
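The title mentions speech unit discovery with a VQ-VAE. The core of vector quantization is assigning each encoder output frame to its nearest codebook vector; a minimal sketch of that step (toy shapes, and omitting the straight-through estimator and commitment loss a trained VQ-VAE needs):

```python
import numpy as np

def vq_quantize(z, codebook):
    """Assign each latent frame to its nearest codebook vector (L2).

    z: (T, D) encoder outputs; codebook: (K, D) learned code vectors.
    Returns (indices, quantized): the discrete "speech units" and
    their continuous embeddings. Illustrative only.
    """
    # pairwise squared distances between frames and codes
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
z = rng.normal(size=(5, 8))          # 5 frames of 8-dim latents (toy)
codebook = rng.normal(size=(16, 8))  # 16 candidate units (toy)
idx, zq = vq_quantize(z, codebook)
```

The discrete indices `idx` are what the paper would treat as self-supervised speech units.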

Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding

no code implementations • 11 Apr 2022 • Sanjana Sankar, Denis Beautemps, Thomas Hueber

This paper proposes a simple and effective approach to the automatic recognition of Cued Speech (CS), a visual communication tool that helps people with hearing impairment understand spoken language: hand gestures, in complement to lipreading, uniquely identify the uttered phonemes.

Lipreading • speech-recognition +1
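The entry above relies on CTC decoding. The basic CTC collapse rule (merge repeated symbols, then drop blanks) can be sketched as follows; the paper's *constrained* decoding adds restrictions on top of this, which are not shown here:

```python
def ctc_collapse(path, blank="-"):
    """Collapse a CTC alignment path: merge consecutive repeats,
    then drop blank symbols. Standard CTC rule, toy symbols."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return out

ctc_collapse(list("--ab--bb-a"))  # → ['a', 'b', 'b', 'a']
```

Note how the blank lets the decoder emit the same phoneme twice in a row (`bb` separated by `-` collapses to two `b`s in the second group's absence; here the `-a` tail yields a final `a`).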

Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation

no code implementations • 5 Apr 2022 • Marc-Antoine Georges, Julien Diard, Laurent Girin, Jean-Luc Schwartz, Thomas Hueber

We propose a computational model of speech production combining three components: (i) a pre-trained neural articulatory synthesizer able to reproduce complex speech stimuli from a limited set of interpretable articulatory parameters; (ii) a DNN-based internal forward model predicting the sensory consequences of articulatory commands; and (iii) an internal inverse model, based on a recurrent neural network, recovering articulatory commands from the acoustic speech input.

Self-Supervised Learning
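The imitation idea in the entry above — learn an inverse model purely by comparing re-synthesized sound with heard sound, never observing the true articulatory commands — can be sketched with linear toy stand-ins. Here one fixed linear map plays the role of both the synthesizer and its forward model, and a linear inverse model is fitted by gradient descent on the acoustic error; none of this is the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins (assumed): a frozen linear "synthesizer" S maps
# 3 articulatory commands to 6 acoustic features; the linear
# "inverse model" W must learn to recover commands from sound.
S = rng.normal(size=(6, 3)) / np.sqrt(6)   # frozen synthesizer
A = rng.normal(size=(3, 200))              # another speaker's commands (hidden)
X = S @ A                                  # the sounds the learner hears

W = np.zeros((3, 6))                       # inverse model, to be learned
lr = 0.05
for _ in range(500):
    err = S @ (W @ X) - X                  # acoustic imitation error
    W -= lr * S.T @ err @ X.T / X.shape[1] # gradient step on mean ||err||^2
```

After training, re-synthesizing from the inferred commands approximates the heard sounds, even though the true commands `A` were never used as targets.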

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS

no code implementations • 4 Sep 2020 • Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber

In this paper, we study the behavior of a neural sequence-to-sequence TTS system used in an incremental mode, i.e. when generating the speech output for token n, the system has access to n + k tokens of the text sequence.

Sentence • Speech Synthesis +1
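The incremental setting described above is easy to make concrete: when generating output for token n with lookahead k, the system sees only a prefix of the text. A minimal sketch (the indexing convention follows the abstract's "n + k tokens"):

```python
def visible_context(tokens, n, k):
    """Tokens available when generating speech for token n with
    lookahead k: the first n + k input tokens, clamped to the
    sentence length. Illustrative helper, not from the paper."""
    return tokens[: min(n + k, len(tokens))]

visible_context("the cat sat on the mat".split(), n=2, k=1)
```

Larger k gives the TTS model more future context (and better prosody) at the cost of higher latency — the trade-off the paper investigates.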

Dynamical Variational Autoencoders: A Comprehensive Review

1 code implementation • 28 Aug 2020 • Laurent Girin, Simon Leglaive, Xiaoyu Bie, Julien Diard, Thomas Hueber, Xavier Alameda-Pineda

Recently, a series of papers have presented different extensions of the VAE for sequential data: relying on recurrent neural networks or state-space models, they model not only the latent space but also the temporal dependencies within a sequence of data vectors and the corresponding latent vectors.

3D Human Dynamics • Resynthesis +2
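One of the simplest members of the DVAE family the review covers is a linear-Gaussian state-space model, where the latent z_t depends on z_{t-1} (the temporal dependency in the latent space) and each observation x_t is generated from z_t. A sampling sketch with toy parameters, purely to illustrate the generative structure:

```python
import numpy as np

def sample_ssm(T, A, C, sigma_z=0.1, sigma_x=0.05, seed=0):
    """Ancestral sampling from a linear-Gaussian state-space model:
    z_t = A z_{t-1} + noise (latent dynamics), x_t = C z_t + noise
    (observation). Toy parameters, not from the paper."""
    rng = np.random.default_rng(seed)
    dz, dx = A.shape[0], C.shape[0]
    z = np.zeros(dz)
    zs, xs = [], []
    for _ in range(T):
        z = A @ z + sigma_z * rng.normal(size=dz)   # p(z_t | z_{t-1})
        x = C @ z + sigma_x * rng.normal(size=dx)   # p(x_t | z_t)
        zs.append(z)
        xs.append(x)
    return np.array(zs), np.array(xs)

A = 0.9 * np.eye(2)                 # latent transition (assumed)
C = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])          # observation map (assumed)
zs, xs = sample_ssm(T=50, A=A, C=C)
```

Deep DVAEs replace the linear maps A and C with neural networks (often recurrent), which is exactly the design axis the review organizes.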

Autoencoders for music sound modeling: a comparison of linear, shallow, deep, recurrent and variational models

no code implementations • 11 Jun 2018 • Fanny Roche, Thomas Hueber, Samuel Limier, Laurent Girin

This study investigates the use of non-linear unsupervised dimensionality reduction techniques to compress a music dataset into a low-dimensional representation which can be used in turn for the synthesis of new sounds.

Audio and Speech Processing • Sound
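The "linear" end of the model family this study compares can be sketched directly: the optimal linear autoencoder under mean-squared error spans the same subspace as PCA, so compression and reconstruction reduce to an SVD. Toy data and shapes only; this is a baseline sketch, not the paper's experimental setup:

```python
import numpy as np

def linear_autoencode(X, k):
    """Compress frames X (n_frames, n_bins) to k dimensions with the
    optimal linear autoencoder (equivalent to PCA via SVD), then
    reconstruct. Stands in for the 'linear' baseline the study
    compares against shallow/deep/recurrent/variational models."""
    mu = X.mean(axis=0)
    Xc = X - mu
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    code = Xc @ Vt[:k].T           # encoder: project onto top-k components
    recon = code @ Vt[:k] + mu     # decoder: map back to the input space
    return code, recon

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))     # toy spectral frames
code, recon = linear_autoencode(X, k=8)
```

The non-linear models in the study aim to beat this baseline's reconstruction quality at the same code size k, while keeping the low-dimensional code usable for synthesizing new sounds.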
