Polyphone disambiguation

3 papers with code • 1 benchmark • 1 dataset

A component of the text-to-speech (TTS) front end that predicts the correct pronunciation of each polyphonic character in the input text.
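To make the task concrete, here is a minimal sketch in Python of what a disambiguator has to decide. It is a toy lookup, not how the papers listed on this page work: they use learned models. The names `CANDIDATES`, `WORD_READINGS`, and `disambiguate` are hypothetical, and the small word list is an assumption chosen for illustration.

```python
# Toy illustration only: real systems learn this mapping from data.
# Candidate pinyin readings (tone as a trailing digit) for two common
# Mandarin polyphonic characters.
CANDIDATES = {
    "行": ["xing2", "hang2"],
    "长": ["chang2", "zhang3"],
}

# Hypothetical context lexicon: words that fix the reading of a polyphone.
WORD_READINGS = {
    "银行": {"行": "hang2"},   # "bank"
    "行走": {"行": "xing2"},   # "to walk"
    "长城": {"长": "chang2"},  # "Great Wall"
    "成长": {"长": "zhang3"},  # "to grow up"
}


def disambiguate(text):
    """Return (character, reading) pairs for each polyphone found in text."""
    results = []
    for i, ch in enumerate(text):
        if ch not in CANDIDATES:
            continue
        # Fall back to the first listed candidate when no context word matches.
        reading = CANDIDATES[ch][0]
        for word, readings in WORD_READINGS.items():
            k = word.find(ch)
            if k == -1 or ch not in readings:
                continue
            # Check whether this word covers position i in the text.
            start = i - k
            if start >= 0 and text[start:start + len(word)] == word:
                reading = readings[ch]
                break
        results.append((ch, reading))
    return results


print(disambiguate("我在长城行走"))
```

The same character 行 resolves to a different reading depending on the surrounding word, which is exactly the ambiguity a learned model has to resolve from context when no lexicon entry applies.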

Latest papers with no code

Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

no code yet • 14 Apr 2024

Furthermore, in the second stage, a parallelized TTS frontend model is devised to perform the text normalization (TN), polyphone disambiguation (PD), and prosody boundary prediction (PBP) tasks.

External Knowledge Augmented Polyphone Disambiguation Using Large Language Model

no code yet • 19 Dec 2023

One of the key issues in Mandarin Chinese text-to-speech (TTS) systems is polyphone disambiguation when doing grapheme-to-phoneme (G2P) conversion.

Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation

no code yet • 17 Nov 2022

In this paper, we propose a simple back-translation-style data augmentation method for Mandarin Chinese polyphone disambiguation, utilizing a large amount of unlabeled text data.

A Polyphone BERT for Polyphone Disambiguation in Mandarin Chinese

no code yet • 1 Jul 2022

Grapheme-to-phoneme (G2P) conversion is an indispensable part of a Mandarin Chinese text-to-speech (TTS) system, and the core of G2P conversion is polyphone disambiguation: picking the correct pronunciation from several candidates for a polyphonic Chinese character.

Polyphone disambiguation and accent prediction using pre-trained language models in Japanese TTS front-end

no code yet • 24 Jan 2022

Although end-to-end text-to-speech (TTS) models can generate natural speech, challenges still remain when it comes to estimating sentence-level phonetic and prosodic information from raw text in Japanese TTS systems.

Improving Prosody for Unseen Texts in Speech Synthesis by Utilizing Linguistic Information and Noisy Data

no code yet • 15 Nov 2021

Recent advancements in end-to-end speech synthesis have made it possible to generate highly natural speech.

Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning

no code yet • 1 Feb 2021

In this paper, we propose a novel semi-supervised learning (SSL) framework for Mandarin Chinese polyphone disambiguation that can potentially leverage unlimited unlabeled text data.

A Mask-based Model for Mandarin Chinese Polyphone Disambiguation

no code yet • 21 Oct 2020

Moreover, to mitigate the uneven distribution of pronunciations, we introduce a new loss called Modified Focal Loss.
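The excerpt above does not say how the loss is modified, so the following is only a sketch of the standard focal loss that the name refers to, not the paper's variant. It shows how the factor (1 - p_t)^gamma down-weights examples the model already classifies confidently, which is why this family of losses is used when some pronunciations are much rarer than others. The function names and the default `gamma=2.0` are assumptions for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Standard focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    `logits` holds one score per candidate pronunciation and `target` is the
    index of the correct one. With gamma = 0 this reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# An easy example (correct reading already favoured) versus a hard one.
easy = focal_loss([4.0, 0.0], target=0)
hard = focal_loss([0.0, 4.0], target=0)
print(easy, hard)
```

Relative to cross-entropy, the easy example contributes far less to the total loss, so training signal concentrates on the hard, typically rarer, pronunciations.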

A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis

no code yet • 11 Nov 2019

In a Mandarin text-to-speech (TTS) system, the front-end text processing module significantly influences the intelligibility and naturalness of synthesized speech.

Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features

no code yet • 3 Jul 2019

This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation.