1 code implementation • 6 Jun 2024 • Théodor Lemerle, Nicolas Obin, Axel Roebel
Recent advancements in text-to-speech (TTS) powered by language models have showcased remarkable capabilities in achieving naturalness and zero-shot voice cloning.
no code implementations • 5 Oct 2023 • Frederik Bous, Axel Roebel
The information bottleneck auto-encoder is a tool for disentanglement commonly used for voice transformation.
no code implementations • 20 Sep 2023 • Giovanni Bindi, Nils Demerlé, Rodrigo Diaz, David Genova, Aliénor Golvet, Ben Hayes, Jiawen Huang, Lele Liu, Vincent Martos, Sarah Nabi, Teresa Pelinski, Lenny Renault, Saurjya Sarkar, Pedro Sarmento, Cyrus Vahidi, Lewis Wolstanholme, Yixiao Zhang, Axel Roebel, Nick Bryan-Kinns, Jean-Louis Giavitto, Mathieu Barthet
The students represent the future generation of AI and music researchers.
no code implementations • 8 Apr 2022 • Frederik Bous, Axel Roebel
We introduce the recording factor that relates the voice level to the recorded signal power as a proportionality constant.
no code implementations • 11 Feb 2022 • Daniel Wolff, Rémi Mignot, Axel Roebel
The ability of our models to capture variance is explored in a detector for artefacts from decompression of corrupted MP3 compressed audio.
no code implementations • 7 Oct 2021 • Axel Roebel, Frederik Bous
This paper introduces the Multi-Band Excited WaveNet, a neural vocoder for speaking and singing voices.
no code implementations • 26 Jul 2021 • Laurent Benaroya, Nicolas Obin, Axel Roebel
Voice conversion (VC) consists of digitally altering the voice of an individual to manipulate part of its content, primarily its identity, while maintaining the rest unchanged.
no code implementations • 15 Apr 2021 • Clément Le Moine Veillon, Nicolas Obin, Axel Roebel
A single neural network is proposed, in which a first module is used to learn F0 representation over different temporal scales and a second adversarial module is used to learn the transformation from one emotion to another.
no code implementations • 15 Apr 2021 • Clément Le Moine, Nicolas Obin, Axel Roebel
The Speech Emotion Recognition (SER) task has seen significant improvements in recent years with the advent of Deep Neural Networks (DNNs).
no code implementations • JEPTALNRECITAL 2020 • Mathias Quillot, Lauriane Guillou, Adrien Gresse, Rafaël Ferro, Raphaël Röth, Damien Malinas, Richard Dufour, Axel Roebel, Nicolas Obin, Jean-François Bonastre, Emmanuel Ethis
Acted voice represents a major challenge for future voice interfaces, with extremely significant application potential for the digital transformation of the culture and communication sectors, such as the production or post-production of voices for series or cinema.
1 code implementation • 22 Oct 2019 • Luc Ardaillon, Axel Roebel
Following this trend, we propose here a simple approach that performs a regression from the speech waveform to a target signal from which the glottal closure instants (GCI) are easily obtained by peak-picking.
no code implementations • 22 Oct 2019 • Rafael Ferro, Nicolas Obin, Axel Roebel
This paper tackles GAN optimization and stability issues in the context of voice conversion.
no code implementations • 31 Jan 2015 • Mathieu Lagrange, Grégoire Lafay, Mathias Rossignol, Emmanouil Benetos, Axel Roebel
This paper introduces a model of environmental acoustic scenes which adopts a morphological approach by abstracting temporal structures of acoustic scenes.