Search Results for author: Nobukatsu Hojo

Found 14 papers, 6 papers with code

MaskCycleGAN-VC: Learning Non-parallel Voice Conversion with Filling in Frames

2 code implementations · 25 Feb 2021 · Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

With FIF, we apply a temporal mask to the input mel-spectrogram and encourage the converter to fill in missing frames based on surrounding frames.

Voice Conversion

Model architectures to extrapolate emotional expressions in DNN-based text-to-speech

no code implementations · 20 Feb 2021 · Katsuki Inoue, Sunao Hara, Masanobu Abe, Nobukatsu Hojo, Yusuke Ijima

In this study, "extrapolating emotional expressions" means borrowing emotional expressions from other speakers, so that collecting emotional speech uttered by the target speakers is unnecessary.

CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-spectrogram Conversion

1 code implementation · 22 Oct 2020 · Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

To address this, we examined the applicability of CycleGAN-VC/VC2 to mel-spectrogram conversion.

Voice Conversion

Nonparallel Voice Conversion with Augmented Classifier Star Generative Adversarial Networks

no code implementations · 27 Aug 2020 · Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo

We previously proposed a method that allows for nonparallel voice conversion (VC) by using a variant of generative adversarial networks (GANs) called StarGAN.

Voice Conversion

Many-to-Many Voice Transformer Network

no code implementations · 18 May 2020 · Hirokazu Kameoka, Wen-Chin Huang, Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Tomoki Toda

The main idea we propose is an extension of the original VTN that can simultaneously learn mappings among multiple speakers.

Voice Conversion

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion

2 code implementations · 29 Jul 2019 · Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

To bridge this gap, we rethink conditional methods of StarGAN-VC, which are key components for achieving non-parallel multi-domain VC in a single model, and propose an improved variant called StarGAN-VC2.

Voice Conversion

CycleGAN-VC2: Improved CycleGAN-based Non-parallel Voice Conversion

2 code implementations · 9 Apr 2019 · Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Nobukatsu Hojo

Non-parallel voice conversion (VC) is a technique for learning the mapping from source to target speech without relying on parallel data.

Voice Conversion

WaveCycleGAN2: Time-domain Neural Post-filter for Speech Waveform Generation

no code implementations · 5 Apr 2019 · Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo

WaveCycleGAN has recently been proposed to bridge the gap between natural and synthesized speech waveforms in statistical parametric speech synthesis. It provides fast inference by using a moving average model rather than an autoregressive model, and high-quality speech synthesis through adversarial training.

Speech Synthesis

AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms

no code implementations · 9 Nov 2018 · Kou Tanaka, Hirokazu Kameoka, Takuhiro Kaneko, Nobukatsu Hojo

This paper describes a method for voice conversion (VC) tasks based on sequence-to-sequence (Seq2Seq) learning with attention and context preservation mechanisms.

Image Captioning · Machine Translation · +4

ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion

no code implementations · 5 Nov 2018 · Hirokazu Kameoka, Kou Tanaka, Damian Kwasny, Takuhiro Kaneko, Nobukatsu Hojo

Second, it achieves many-to-many conversion by simultaneously learning mappings among multiple speakers using only a single model instead of separately learning mappings between each speaker pair using a different model.

Speech Enhancement · Voice Conversion

WaveCycleGAN: Synthetic-to-natural speech waveform conversion using cycle-consistent adversarial networks

no code implementations · 25 Sep 2018 · Kou Tanaka, Takuhiro Kaneko, Nobukatsu Hojo, Hirokazu Kameoka

The experimental results demonstrate that our proposed method can 1) alleviate the over-smoothing effect of the acoustic features despite the direct modification method used for the waveform and 2) greatly improve the naturalness of the generated speech sounds.

Speech Synthesis · Voice Conversion

ACVAE-VC: Non-parallel many-to-many voice conversion with auxiliary classifier variational autoencoder

2 code implementations · 13 Aug 2018 · Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo

Such situations can be avoided by introducing an auxiliary classifier and training the encoder and decoder so that the attribute classes of the decoder outputs are correctly predicted by the classifier.

Voice Conversion

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks

11 code implementations · 6 Jun 2018 · Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo

This paper proposes a method that allows non-parallel many-to-many voice conversion (VC) by using a variant of a generative adversarial network (GAN) called StarGAN.

Voice Conversion

Generative adversarial network-based approach to signal reconstruction from magnitude spectrograms

no code implementations · 6 Apr 2018 · Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Kou Tanaka, Nobukatsu Hojo, Hiroyasu Ando

In this paper, we address the problem of reconstructing a time-domain signal (or a phase spectrogram) solely from a magnitude spectrogram.
