Multi-speaker Emotional Text-to-speech Synthesizer

no code implementations • 7 Dec 2021 • Sungjae Cho, Soo-Young Lee

We present a methodology to train our multi-speaker emotional text-to-speech synthesizer, which can synthesize speech in 7 different emotions for each of 10 speakers.

Unigram-Normalized Perplexity as a Language Model Performance Measure with Different Vocabulary Sizes

no code implementations • 26 Nov 2020 • Jihyeon Roh, Sang-Hoon Oh, Soo-Young Lee

Although perplexity is a widely used performance metric for language models, its values depend heavily on the number of words in the corpus, so it is useful only for comparing performance on the same corpus.

Language Modelling
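
The dependence on vocabulary size can be illustrated with a minimal sketch of standard perplexity, assuming a hypothetical uniform model that assigns probability 1/V to every token. This is the conventional definition only, not the unigram-normalized measure the paper proposes, and the function and variable names are illustrative.

```python
import math

def perplexity(token_probs):
    """Standard perplexity: exp of the mean negative log-probability per token."""
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# Hypothetical uniform models: each token gets probability 1/V,
# so the perplexity equals the vocabulary size V.
small_vocab = perplexity([1 / 100] * 50)     # V = 100
large_vocab = perplexity([1 / 10000] * 50)   # V = 10000

print(small_vocab, large_vocab)
```

Both models are equally uninformative, yet their perplexities differ by a factor of 100, which is why raw perplexity values are hard to compare across vocabularies.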

Hierarchical GPT with Congruent Transformers for Multi-Sentence Language Models

no code implementations • 18 Sep 2020 • Jihyeon Roh, Huiseong Gim, Soo-Young Lee

First, we propose a hierarchical GPT which consists of three blocks, i.e., a sentence encoding block, a sentence generating block, and a sentence decoding block.

Dialogue Generation Language Modelling +2

Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition

1 code implementation • 6 Nov 2018 • Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Sang-Hoon Oh, Soo-Young Lee

Many speech enhancement methods try to learn the relationship between noisy and clean speech, obtained using an acoustic room simulator.

Speech Enhancement speech-recognition +1

A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

1 code implementation • 12 Oct 2018 • Azam Rabiee, Soo-Young Lee

This paper introduces a deep neural network model for a subband-based speech synthesizer.

End-to-end Multimodal Emotion and Gender Recognition with Dynamic Joint Loss Weights

1 code implementation • 4 Sep 2018 • Myungsu Chae, Tae-Ho Kim, Young Hoon Shin, June-Woo Kim, Soo-Young Lee

In our experiments, emotion and gender recognition with the proposed method yielded a lower joint loss, computed as the negative log-likelihood, than training with static weights for the joint loss.

Multi-Task Learning

Voice Imitating Text-to-Speech Neural Networks

no code implementations • journal 2018 • Young-Gun Lee, Taesu Kim, Soo-Young Lee

We propose a neural text-to-speech (TTS) model that can imitate a new speaker's voice using only a small speech sample.

Emotional End-to-End Neural Speech Synthesizer

1 code implementation • 15 Nov 2017 • Young-Gun Lee, Azam Rabiee, Soo-Young Lee

In this paper, we introduce an emotional speech synthesizer based on the recent end-to-end neural model, named Tacotron.

Deep CNNs along the Time Axis with Intermap Pooling for Robustness to Spectral Variations

no code implementations • 10 Jun 2016 • Hwaran Lee, Geonmin Kim, Ho-Gyeong Kim, Sang-Hoon Oh, Soo-Young Lee

Convolutional neural networks (CNNs) with convolutional and pooling operations along the frequency axis have been proposed to attain invariance to frequency shifts of features.

Compositional Sentence Representation from Character within Large Context Text

no code implementations • 2 May 2016 • Geonmin Kim, Hwaran Lee, Jisu Choi, Soo-Young Lee

In the HCRN, word representations are built from characters, thus resolving the data-sparsity problem, and inter-sentence dependency is embedded into sentence representation at the level of sentence composition.

Dialogue Act Classification General Classification

