Prosody Prediction

2 papers with code • 1 benchmark • 2 datasets

Prosody prediction is the task of predicting prosodic prominence from text. It is framed as a 2-way (binary) classification task: each word in a sentence is assigned the label 1 (prominent) or 0 (non-prominent).
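The per-word labeling described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `FUNCTION_WORDS` list, the `predict_prominence` heuristic (function words are non-prominent, everything else is prominent), the `word_accuracy` helper, and the example sentence with its gold labels are hypothetical and are not taken from the Helsinki Prosody Corpus or from any listed paper. A real system would replace the heuristic with a trained classifier.

```python
from typing import List

# Hypothetical toy list of function words; not taken from any corpus or paper.
FUNCTION_WORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "was", "it"}


def predict_prominence(words: List[str]) -> List[int]:
    """Assign each word 1 (prominent) or 0 (non-prominent).

    Toy heuristic (an assumption): function words are non-prominent,
    all other words are prominent.
    """
    return [0 if w.lower() in FUNCTION_WORDS else 1 for w in words]


def word_accuracy(gold: List[int], pred: List[int]) -> float:
    """Fraction of words whose predicted label matches the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


if __name__ == "__main__":
    words = ["The", "cat", "sat", "on", "the", "mat"]
    gold = [0, 1, 1, 0, 0, 1]  # made-up labels, for illustration only
    pred = predict_prominence(words)
    print(list(zip(words, pred)))
    print(word_accuracy(gold, pred))
```

The sketch keeps one label per input word, so the output sequence always has the same length as the sentence. That alignment is what makes word-level metrics such as accuracy straightforward to compute.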

(Image credit: Helsinki Prosody Corpus)

Latest papers with no code

Prosody Analysis of Audiobooks

no code yet • 10 Oct 2023

Recent advances in text-to-speech have made it possible to generate natural-sounding audio from text.

A Comparative Analysis of Pretrained Language Models for Text-to-Speech

no code yet • 4 Sep 2023

In this study, we aim to address this gap by conducting a comparative analysis of different PLMs for two TTS tasks: prosody prediction and pause prediction.

Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data

no code yet • 29 Jun 2023

We propose a method for speech-to-speech emotion-preserving translation that operates at the level of discrete speech units.

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model

no code yet • 10 Jun 2023

This study is focused on understanding and quantifying the change in phoneme and prosody information encoded in the Self-Supervised Learning (SSL) model, brought by an accent identification (AID) fine-tuning task.

Ensemble prosody prediction for expressive speech synthesis

no code yet • 3 Apr 2023

Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech.

Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis

no code yet • 14 Mar 2023

Cross-speaker style transfer in speech synthesis aims at transferring a style from a source speaker to synthesized speech of a target speaker's timbre.

Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit

no code yet • 13 Aug 2020

Recent neural speech synthesis systems have gradually focused on the control of prosody to improve the quality of synthesized speech, but they rarely consider the variability of prosody and the correlation between prosody and semantics together.

Controllable Sequence-To-Sequence Neural TTS with LPCNET Backend for Real-time Speech Synthesis on CPU

no code yet • 25 Feb 2020

State-of-the-art sequence-to-sequence acoustic networks that convert a phonetic sequence to a sequence of spectral features with no explicit prosody prediction generate speech with close to natural quality when cascaded with neural vocoders such as WaveNet.

Automatic Prosody Prediction for Chinese Speech Synthesis using BLSTM-RNN and Embedding Features

no code yet • 2 Nov 2015

Prosody affects the naturalness and intelligibility of speech.