Emotional Speech Synthesis

3 papers with code • 0 benchmarks • 2 datasets

Latest papers with no code

ED-TTS: Multi-Scale Emotion Modeling using Cross-Domain Emotion Diarization for Emotional Speech Synthesis

no code yet • 16 Jan 2024

We introduce ED-TTS, a multi-scale emotional speech synthesis model that leverages Speech Emotion Diarization (SED) and Speech Emotion Recognition (SER) to model emotions at different levels.

QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis

no code yet • 14 Mar 2023

Recent expressive text-to-speech (TTS) models focus on synthesizing emotional speech, but neglect some fine-grained styles such as intonation.

Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations

no code yet • 11 Nov 2022

However, in the emotional latent space generated by existing models, continuous emotional intensity is difficult to control because features such as emotion and speaker identity are entangled.

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

no code yet • 28 Oct 2022

From these features, the proposed periodicity generator produces a sample-level sinusoidal source that enables the waveform decoder to accurately reproduce the pitch.

Speech Synthesis with Mixed Emotions

no code yet • 11 Aug 2022

We then incorporate our formulation into a sequence-to-sequence emotional text-to-speech framework.

GANtron: Emotional Speech Synthesis with Generative Adversarial Networks

no code yet • 6 Oct 2021

Speech synthesis is used in a wide variety of industries.

EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

no code yet • 17 Jun 2021

Finally, by showing a comparable performance in the emotional speech synthesis task, we successfully demonstrate the ability of the proposed model.

Sentiment Analysis for Emotional Speech Synthesis in a News Dialogue System

no code yet • COLING 2020

As smart speakers and conversational robots become ubiquitous, the demand for expressive speech synthesis has increased.

Multi-stream Attention-based BLSTM with Feature Segmentation for Speech Emotion Recognition

no code yet • Interspeech 2020

One of the model’s weaknesses is that it cannot consider the statistics of speech features, which are known to be effective for speech emotion recognition.

End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training

no code yet • 26 Jun 2019

Objective and subjective evaluation results show that our model outperforms the conventional Tacotron model for ESS when only 5% of training data has emotion labels.