Search Results for author: Shigeki Karita

Found 11 papers, 3 papers with code

Lenient Evaluation of Japanese Speech Recognition: Modeling Naturally Occurring Spelling Inconsistency

no code implementations • 7 Jun 2023 • Shigeki Karita, Richard Sproat, Haruko Ishikawa

Word error rate (WER) and character error rate (CER) are standard metrics in automatic speech recognition (ASR), but one problem has always been alternative spellings: if one's system transcribes "adviser" whereas the ground truth has "advisor", this will count as an error even though the two spellings really represent the same word.
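The spelling problem described above can be illustrated with a minimal sketch. It is not the paper's method; it only shows how a strict WER penalizes a spelling variant and how a lenient comparison would not. The `VARIANTS` table and the function names are hypothetical, introduced for illustration.

```python
def edit_distance(ref, hyp, same=lambda a, b: a == b):
    """Levenshtein distance between two token lists.

    `same` decides whether two tokens count as a match.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            substitution = prev[j - 1] + (0 if same(r, h) else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1]


def wer(reference, hypothesis, same=lambda a, b: a == b):
    """Word error rate: word-level edit distance / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    return edit_distance(ref_words, hyp_words, same) / len(ref_words)


# Hypothetical table of interchangeable spellings (an assumption,
# not data from the paper).
VARIANTS = [{"adviser", "advisor"}]


def lenient_same(a, b):
    """Treat two words as equal if they share a variant group."""
    return a == b or any(a in group and b in group for group in VARIANTS)


strict = wer("the advisor spoke", "the adviser spoke")
lenient = wer("the advisor spoke", "the adviser spoke", same=lenient_same)
print(strict, lenient)
```

With the strict comparison, the single spelling difference counts as one substitution out of three reference words; with the lenient comparison, the same hypothesis scores zero errors.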

Tasks: Machine Translation, Speech Recognition (+2 more)

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

no code implementations • 30 May 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Michiel Bacchiani, Yu Zhang, Wei Han, Ankur Bapna

The constituent samples of LibriTTS-R are identical to those of LibriTTS, with only the sound quality improved.

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

1 code implementation • 3 Mar 2023 • Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Experiments show that Miipher (i) is robust against various types of audio degradation and (ii) enables us to train a high-quality text-to-speech (TTS) model from restored speech samples collected from the Web.

Tasks: Speech Denoising, Speech Enhancement

Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers

no code implementations • 16 Feb 2022 • Yotaro Kubo, Shigeki Karita, Michiel Bacchiani

Since embedding vectors can be regarded as implicit representations of linguistic information such as part-of-speech and intent, they are also expected to be useful modeling cues for ASR decoders.

Tasks: Automatic Speech Recognition (ASR) (+3 more)

SNRi Target Training for Joint Speech Enhancement and Recognition

no code implementations • 1 Nov 2021 • Yuma Koizumi, Shigeki Karita, Arun Narayanan, Sankaran Panchapagesan, Michiel Bacchiani

Furthermore, by analyzing the predicted target SNRi, we observed that the jointly trained network automatically controls the target SNRi according to noise characteristics.

Tasks: Automatic Speech Recognition (ASR) (+2 more)

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition

no code implementations • 9 Jun 2021 • Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones

End-to-end (E2E) modeling is advantageous for automatic speech recognition (ASR), especially for Japanese, since word-based tokenization of Japanese is not trivial and E2E modeling is able to model character sequences directly.

Tasks: Automatic Speech Recognition (ASR) (+2 more)

Unsupervised Learning of Disentangled Speech Content and Style Representation

no code implementations • 24 Oct 2020 • Andros Tjandra, Ruoming Pang, Yu Zhang, Shigeki Karita

We present an approach for unsupervised learning of speech representations that disentangle content and style.

Tasks: Decoder, Speaker Recognition
