Search Results for author: Hieu-Thi Luong

Found 12 papers, 0 papers with code

Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance

no code implementations · 25 Jun 2021 · Hieu-Thi Luong, Junichi Yamagishi

Generally speaking, the main objective when training a neural speech synthesis system is to synthesize natural and expressive speech from the output layer of the network, with little attention given to the hidden layers.
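
The title points at vector-quantized latents: each continuous hidden vector is snapped to its nearest entry in a learned codebook. A minimal sketch of that lookup, with codebook size and dimensions as illustrative assumptions rather than the paper's actual configuration:

```python
import torch

# Hypothetical sizes; the paper's actual codebook configuration is not given here.
num_codes, code_dim = 256, 64
codebook = torch.randn(num_codes, code_dim)      # learned jointly with the network

def quantize(latents):
    """Replace each latent vector with its nearest codebook entry (L2 distance)."""
    dists = torch.cdist(latents, codebook)       # (batch, num_codes) distances
    indices = dists.argmin(dim=1)                # index of the nearest code
    return codebook[indices], indices

z = torch.randn(8, code_dim)                     # continuous hidden-layer output
z_q, ids = quantize(z)                           # discrete latent used downstream
```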

Quantization, Speech Synthesis, +1

Latent linguistic embedding for cross-lingual text-to-speech and voice conversion

no code implementations · 8 Oct 2020 · Hieu-Thi Luong, Junichi Yamagishi

As the recently proposed voice cloning system, NAUTILUS, is capable of cloning unseen voices using untranscribed speech, we investigate the feasibility of using it to develop a unified cross-lingual TTS/VC system.

Voice Cloning, Voice Conversion

NAUTILUS: a Versatile Voice Cloning System

no code implementations · 22 May 2020 · Hieu-Thi Luong, Junichi Yamagishi

By using a multi-speaker speech corpus to train all requisite encoders and decoders in the initial training stage, our system can clone unseen voices from untranscribed speech of target speakers via backpropagation.
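
The paper's exact adaptation recipe is not spelled out in this snippet; one common reading of "cloning by backpropagation from untranscribed speech" is to freeze the trained encoders and decoders and optimize only a speaker embedding against a reconstruction loss. A sketch under that assumption (all module names and sizes are hypothetical):

```python
import torch

# Hypothetical stand-ins for the trained encoders/decoders; in this sketch
# only the speaker embedding receives gradient updates during cloning.
speech_encoder = torch.nn.GRU(80, 128, batch_first=True)   # frozen after initial training
decoder = torch.nn.Linear(128 + 32, 80)                    # frozen after initial training
for p in list(speech_encoder.parameters()) + list(decoder.parameters()):
    p.requires_grad_(False)

spk_embed = torch.zeros(1, 32, requires_grad=True)         # the only trainable tensor
opt = torch.optim.Adam([spk_embed], lr=1e-3)

mels = torch.randn(1, 200, 80)                             # untranscribed target speech (mel frames)
for _ in range(100):
    h, _ = speech_encoder(mels)
    cond = spk_embed.unsqueeze(1).expand(-1, h.size(1), -1)
    recon = decoder(torch.cat([h, cond], dim=-1))
    loss = torch.nn.functional.l1_loss(recon, mels)        # no transcript needed
    opt.zero_grad()
    loss.backward()
    opt.step()
```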

Speech Synthesis, Voice Cloning, +1

Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech

no code implementations · 14 Sep 2019 · Hieu-Thi Luong, Junichi Yamagishi

Voice conversion (VC) and text-to-speech (TTS) are two tasks that share a similar objective: generating speech with a target voice.

Voice Conversion

A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation

no code implementations · 18 Jun 2019 · Hieu-Thi Luong, Junichi Yamagishi

In this study, we propose a novel speech synthesis model that can be adapted to unseen speakers by fine-tuning part or all of the network using either transcribed or untranscribed speech.
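
A sketch of what "fine-tuning part or all of the network" can look like in practice: freeze everything, then unfreeze a chosen subset of submodules. The three-part split below is hypothetical, not the paper's architecture:

```python
import torch

# Hypothetical three-part model; names are illustrative, not from the paper.
model = torch.nn.ModuleDict({
    "text_encoder": torch.nn.Linear(64, 128),
    "speaker_layers": torch.nn.Linear(128, 128),
    "decoder": torch.nn.Linear(128, 80),
})

def select_finetune(model, parts):
    """Freeze all parameters, then unfreeze only the named submodules."""
    for p in model.parameters():
        p.requires_grad_(False)
    for name in parts:
        for p in model[name].parameters():
            p.requires_grad_(True)
    return [p for p in model.parameters() if p.requires_grad]

# Adapt part of the network ...
params = select_finetune(model, ["speaker_layers"])
# ... or all of it, depending on the available adaptation data.
# params = select_finetune(model, list(model.keys()))
opt = torch.optim.Adam(params, lr=1e-4)
```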

Speech Synthesis

Training Multi-Speaker Neural Text-to-Speech Systems using Speaker-Imbalanced Speech Corpora

no code implementations · 1 Apr 2019 · Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

When the available data of a target speaker is insufficient to train a high-quality speaker-dependent neural text-to-speech (TTS) system, we can combine data from multiple speakers and train a multi-speaker TTS model instead.
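
One standard countermeasure to speaker imbalance, shown here as an assumed illustration rather than the paper's method, is to oversample utterances from low-resource speakers so each speaker contributes roughly equally per epoch:

```python
import torch
from collections import Counter

# Hypothetical utterance list: (features_path, speaker_id) pairs,
# heavily skewed toward one speaker.
utterances = [("a.npy", "spk1")] * 5000 + [("b.npy", "spk2")] * 300

# Weight each utterance inversely to its speaker's corpus share.
counts = Counter(spk for _, spk in utterances)
weights = [1.0 / counts[spk] for _, spk in utterances]
sampler = torch.utils.data.WeightedRandomSampler(
    weights, num_samples=len(utterances), replacement=True
)
# loader = torch.utils.data.DataLoader(dataset, batch_size=32, sampler=sampler)
```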

Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects

no code implementations · 2 Aug 2018 · Hieu-Thi Luong, Xin Wang, Junichi Yamagishi, Nobuyuki Nishizawa

We investigated the impact of noisy linguistic features on the performance of a neural-network-based Japanese speech synthesis system that uses a WaveNet vocoder.

Denoising, Speech Synthesis

Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder

no code implementations · 31 Jul 2018 · Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu

In order to reduce the mismatched characteristics between natural and generated acoustic features, we propose frameworks that incorporate either a conditional generative adversarial network (GAN) or its variant, Wasserstein GAN with gradient penalty (WGAN-GP), into multi-speaker speech synthesis that uses the WaveNet vocoder.
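
The WGAN-GP variant named here adds a gradient penalty that pushes the critic's gradient norm toward 1 on random interpolates between natural and generated acoustic features. A sketch of that penalty term (the toy critic and lambda = 10 are assumptions):

```python
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    """WGAN-GP penalty: push the critic's gradient norm toward 1 on
    random interpolates between real and generated acoustic features."""
    eps = torch.rand(real.size(0), 1, 1)          # per-sample mixing coefficient
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(
        outputs=scores.sum(), inputs=interp, create_graph=True
    )
    norms = grads.flatten(start_dim=1).norm(2, dim=1)
    return lam * ((norms - 1) ** 2).mean()

critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(200 * 80, 1))
real = torch.randn(4, 200, 80)                    # natural acoustic features
fake = torch.randn(4, 200, 80)                    # generated acoustic features
gp = gradient_penalty(critic, real, fake)         # added to the critic loss
```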

Generative Adversarial Network, Speech Synthesis, +1

Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems

no code implementations · 31 Jul 2018 · Hieu-Thi Luong, Junichi Yamagishi

Most neural-network based speaker-adaptive acoustic models for speech synthesis can be categorized into either layer-based or input-code approaches.
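
Scaling and bias codes, per the title, sit between those two families: a compact per-speaker code is projected into element-wise scale and bias terms that modulate a hidden layer, h' = s ⊙ h + b. A sketch of that modulation with hypothetical dimensions:

```python
import torch

hidden_dim, code_dim = 128, 16

# Hypothetical projections from a compact speaker code to per-unit
# scaling and bias terms applied to one hidden layer: h' = s * h + b.
to_scale = torch.nn.Linear(code_dim, hidden_dim)
to_bias = torch.nn.Linear(code_dim, hidden_dim)

def speaker_modulate(h, speaker_code):
    s = to_scale(speaker_code)          # speaker-dependent scaling
    b = to_bias(speaker_code)           # speaker-dependent bias
    return s * h + b

h = torch.randn(4, hidden_dim)          # hidden activations for a batch
code = torch.randn(4, code_dim)         # learned per-speaker code
h_adapted = speaker_modulate(h, code)
```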

Speech Synthesis
