Search Results for author: Tomoki Koriyama

Found 8 papers, 0 papers with code

Frame-Wise Breath Detection with Self-Training: An Exploration of Enhancing Breath Naturalness in Text-to-Speech

no code implementations • 1 Feb 2024 • Dong Yang, Tomoki Koriyama, Yuki Saito

Developing Text-to-Speech (TTS) systems that can synthesize natural breath is essential for human-like voice agents but requires extensive manual annotation of breath positions in training data.
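Self-training of this kind is typically implemented as confidence-thresholded pseudo-labeling: a teacher model labels unlabeled frames, and only confident frames are kept for retraining. A minimal frame-wise sketch of that generic idea (function name and threshold are illustrative assumptions, not from the paper):

```python
import numpy as np

def pseudo_label_frames(probs, threshold=0.9):
    """Turn frame-wise breath probabilities from a teacher model into
    pseudo-labels, keeping only confident frames (generic self-training
    sketch; the 0.9 threshold is an assumed hyperparameter)."""
    probs = np.asarray(probs, dtype=float)
    labels = (probs >= 0.5).astype(int)                 # 1 = breath frame
    confident = np.abs(probs - 0.5) >= (threshold - 0.5)
    return labels, confident                            # retrain only on confident frames

labels, mask = pseudo_label_frames([0.02, 0.97, 0.55, 0.91])
# frame 2 (prob 0.55) is too uncertain to pass the 0.9 threshold
```

Frames the teacher is unsure about are simply excluded from the student's training loss, which is what keeps self-training from amplifying its own mistakes.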

Structured State Space Decoder for Speech Recognition and Synthesis

no code implementations • 31 Oct 2022 • Koichi Miyazaki, Masato Murata, Tomoki Koriyama

Automatic speech recognition (ASR) systems developed in recent years have shown promising results with self-attention models (e.g., Transformer and Conformer), which are replacing conventional recurrent neural networks.

Automatic Speech Recognition (ASR) +1
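The core building block of structured state space models is a discretized linear state-space recurrence applied along the sequence. A minimal sketch of that generic recurrence (toy matrices chosen for illustration, not the paper's parameterization):

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run the linear recurrence x_k = A x_{k-1} + B u_k, y_k = C x_k
    over a scalar input sequence u (generic state-space model sketch)."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_k in u:
        x = A @ x + B * u_k     # state update
        ys.append(C @ x)        # readout
    return np.array(ys)

# toy 2-state system driven by an impulse
A = np.array([[0.9, 0.0], [0.1, 0.8]])
B = np.array([1.0, 0.0])
C = np.array([0.0, 1.0])
y = ssm_scan(A, B, C, [1.0, 0.0, 0.0, 0.0])
# y traces the impulse response [0.0, 0.1, 0.17, 0.217]
```

In practice the same computation can be unrolled as a long convolution, which is what makes these layers efficient to train compared with step-by-step RNNs.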

Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes

no code implementations • 7 Aug 2020 • Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari

We propose a framework for multi-speaker speech synthesis using deep Gaussian processes (DGPs); a DGP is a deep architecture of Bayesian kernel regressions and thus robust to overfitting.

Gaussian Processes · Speech Synthesis +1
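A deep Gaussian process stacks kernel regressions, with one layer's output feeding the next layer's kernel. A minimal noise-free sketch using the GP posterior mean with an RBF kernel (illustrative two-layer toy, not the authors' model or training procedure):

```python
import numpy as np

def rbf(X, Z, ls=1.0):
    """RBF kernel matrix between row-vector sets X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_layer(Xtr, ytr, Xte, jitter=1e-6):
    """GP regression posterior mean: K_*x (K_xx + jitter I)^{-1} y."""
    Kxx = rbf(Xtr, Xtr) + jitter * np.eye(len(Xtr))
    return rbf(Xte, Xtr) @ np.linalg.solve(Kxx, ytr)

# two stacked GP layers: layer-1 predictions become layer-2 inputs
Xtr = np.linspace(0, 1, 5)[:, None]
h_tr = np.sin(2 * np.pi * Xtr)      # toy hidden-layer targets
y_tr = h_tr ** 2                    # toy final targets
Xte = np.array([[0.5]])
h_te = gp_layer(Xtr, h_tr, Xte)     # layer 1: inputs -> hidden features
y_te = gp_layer(h_tr, y_tr, h_te)   # layer 2: hidden features -> output
```

Because each layer is a Bayesian kernel regression rather than a point-estimated weight matrix, the composition inherits the robustness to overfitting mentioned in the abstract.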

Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit

no code implementations • 22 Apr 2020 • Tomoki Koriyama, Hiroshi Saruwatari

This paper presents a deep Gaussian process (DGP) model with a recurrent architecture for speech sequence modeling.

Speech Synthesis

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking

no code implementations • 9 Feb 2019 • Hiroki Tamaru, Yuki Saito, Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari

To address this problem, we use a GMMN to model the variation of the modulation spectrum of the pitch contour of natural singing voices and add a randomized inter-utterance variation to the pitch contour generated by conventional DNN-based singing voice synthesis.

Singing Voice Synthesis
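A generative moment matching network (GMMN) is trained by minimizing the maximum mean discrepancy (MMD) between generated and natural samples. A minimal Gaussian-kernel MMD estimator (the generic loss underlying GMMNs, not the paper's exact objective or bandwidth):

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Biased estimate of squared MMD between sample sets X and Y
    under a Gaussian kernel (generic GMMN training loss)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma**2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2(rng.normal(size=(200, 2)), rng.normal(3.0, 1.0, size=(200, 2)))
# samples from the same distribution give a far smaller MMD than shifted ones
```

Minimizing this quantity pushes the generator's sample distribution toward the natural one, which is how the modulation-spectrum variation of natural pitch contours can be matched without an adversarial discriminator.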

Sampling-based speech parameter generation using moment-matching networks

no code implementations • 12 Apr 2017 • Shinnosuke Takamichi, Tomoki Koriyama, Hiroshi Saruwatari

To give synthetic speech natural inter-utterance variation, this paper builds DNN acoustic models that make it possible to randomly sample speech parameters.

Speech Synthesis
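One simple way to make an acoustic model sampleable is to have it predict a distribution per parameter and draw from it at synthesis time. A minimal sketch of drawing from a predicted diagonal Gaussian (a generic stand-in for illustration, not the paper's moment-matching sampler; all names are hypothetical):

```python
import numpy as np

def sample_parameters(mean, log_var, rng):
    """Draw speech parameters from a predicted diagonal Gaussian
    (generic sampling sketch; mean/log_var would come from a DNN)."""
    mean = np.asarray(mean, dtype=float)
    std = np.exp(0.5 * np.asarray(log_var, dtype=float))
    return mean + std * rng.standard_normal(len(mean))

rng = np.random.default_rng(0)
# two draws from the same predicted distribution differ, giving the
# inter-utterance variation the abstract aims for
draw1 = sample_parameters([0.0, 1.0], [-2.0, -2.0], rng)
draw2 = sample_parameters([0.0, 1.0], [-2.0, -2.0], rng)
```

Each synthesis pass then yields a slightly different but equally plausible parameter trajectory instead of the single averaged output of a deterministic DNN.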
