Search Results for author: Antonio Bonafonte

Found 13 papers, 7 papers with code

Distribution augmentation for low-resource expressive text-to-speech

no code implementations 13 Feb 2022 Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova

This paper presents a novel data augmentation technique for text-to-speech (TTS) that allows generating new (text, audio) training examples without requiring any additional data.

Data Augmentation

Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech

no code implementations 24 Oct 2021 Marek Strelec, Jonas Rohnke, Antonio Bonafonte, Mateusz Łajszczak, Trevor Wood

We present a Split Vector Quantized Variational Autoencoder (SVQ-VAE) architecture using a split vector quantizer for NTTS, as an enhancement to the well-known Variational Autoencoder (VAE) and Vector Quantized Variational Autoencoder (VQ-VAE) architectures.

Prosodic Phrase Alignment for Machine Dubbing

1 code implementation 20 Aug 2019 Alp Öktem, Mireia Farrús, Antonio Bonafonte

Dubbing is a type of audiovisual translation where dialogues are translated and enacted so that they give the impression that the media is in the target language.

Machine Translation, Translation

Problem-Agnostic Speech Embeddings for Multi-Speaker Text-to-Speech with SampleRNN

2 code implementations 3 Jun 2019 David Álvarez, Santiago Pascual, Antonio Bonafonte

This way we feed the acoustic model with speaker acoustically dependent representations that enrich the waveform generation more than discrete embeddings unrelated to these factors.

Sound, Audio and Speech Processing

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

1 code implementation 6 Apr 2019 Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure.

Distant Speech Recognition

Towards Generalized Speech Enhancement with Generative Adversarial Networks

no code implementations 6 Apr 2019 Santiago Pascual, Joan Serrà, Antonio Bonafonte

The speech enhancement task usually consists of removing additive noise or reverberation that partially masks spoken utterances, affecting their intelligibility.

Speech Enhancement

Self-Attention Linguistic-Acoustic Decoder

no code implementations 31 Aug 2018 Santiago Pascual, Antonio Bonafonte, Joan Serrà

The conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models like recurrent neural networks.

Speech Synthesis

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks

3 code implementations 31 Aug 2018 Santiago Pascual, Antonio Bonafonte, Joan Serrà, Jose A. Gonzalez

Most methods of voice restoration for patients suffering from aphonia either produce whispered or monotone speech.

Speech Enhancement

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

3 code implementations 18 Dec 2017 Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data.

Speech Enhancement

SEGAN: Speech Enhancement Generative Adversarial Network

20 code implementations 28 Mar 2017 Santiago Pascual, Antonio Bonafonte, Joan Serrà

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

Speech Enhancement

Building Synthetic Voices in the META-NET Framework

no code implementations LREC 2012 Emília Garcia Casademont, Antonio Bonafonte, Asunción Moreno

It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the mission of META, which is the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society.

Speech Synthesis, Voice Conversion

BUCEADOR, a multi-language search engine for digital libraries

no code implementations LREC 2012 Jordi Adell, Antonio Bonafonte, Antonio Cardenal, Marta R. Costa-jussà, José A. R. Fonollosa, Asunción Moreno, Eva Navas, Eduardo R. Banga

The paper presents the tool functionality, the architecture, and the digital library, and provides some information about the technology involved in the fields of automatic speech recognition, statistical machine translation, text-to-speech synthesis and information retrieval.

Automatic Speech Recognition, Information Retrieval
