Search Results for author: Takaaki Saeki

Found 9 papers, 5 papers with code

SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition

1 code implementation • LREC 2022 • Naoki Kimura, Zixiong Su, Takaaki Saeki, Jun Rekimoto

Although neural end-to-end models are successfully updating the state-of-the-art technology in the field of automatic speech recognition, SSR research based on ultrasound tongue imaging has still not evolved past cascaded DNN-HMM models due to the absence of a large dataset.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Code

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

no code implementations • 29 Feb 2024 • Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov

Without any transcribed speech in a new language, this TTS model can generate intelligible speech in >30 unseen languages (CER difference of <10% to ground truth).

Representation Learning Speech Synthesis

Paper
Add Code

Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech

no code implementations • 27 Feb 2023 • Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari

We also leverage duration-aware pause insertion for more natural multi-speaker TTS.

Language Modelling

Paper
Add Code

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining

1 code implementation • 30 Jan 2023 • Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari

While neural text-to-speech (TTS) has achieved human-like natural synthetic speech, multilingual TTS systems are limited to resource-rich languages due to the need for paired text and studio-quality audio data.

Language Modelling

Paper
Code

SpeechLMScore: Evaluating speech generation using speech language model

2 code implementations • 8 Dec 2022 • Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe

While human evaluation is the most reliable metric for evaluating speech generation systems, it is generally costly and time-consuming.

Language Modelling Speech Enhancement +1

7,867

Paper
Code

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

no code implementations • 27 Oct 2022 • Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

This paper proposes Virtuoso, a massively multilingual speech-text joint semi-supervised learning framework for text-to-speech synthesis (TTS) models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis

1 code implementation • 14 Oct 2022 • Yuta Matsunaga, Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari

We present a comprehensive empirical study for personalized spontaneous speech synthesis on the basis of linguistic knowledge.

Speech Synthesis Voice Cloning

Paper
Code

ESPnet2-TTS: Extending the Edge of TTS Research

1 code implementation • 15 Oct 2021 • Tomoki Hayashi, Ryuichi Yamamoto, Takenori Yoshimura, Peter Wu, Jiatong Shi, Takaaki Saeki, Yooncheol Ju, Yusuke Yasuda, Shinnosuke Takamichi, Shinji Watanabe

This paper describes ESPnet2-TTS, an end-to-end text-to-speech (E2E-TTS) toolkit.

7,867

Paper
Code

Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network

no code implementations • 22 Sep 2021 • Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari

Although this method achieves comparable speech quality to that of a method that waits for the future context, it entails a huge amount of processing for sampling from the language model at each time step.

Knowledge Distillation Language Modelling +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.