Search Results for author: Seung-bin Kim

Found 10 papers, 6 papers with code

JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis

no code implementations · 9 Jan 2025 · Jun-Hyeok Cha, Seung-bin Kim, Hyung-Seok Oh, Seong-Whan Lee

To address this, we introduce JELLY, a novel CSS framework that integrates emotion recognition and context reasoning for generating appropriate speech in conversation by fine-tuning a large language model (LLM) with multiple partial LoRA modules.

Emotion Recognition, Language Modeling, +3
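As a rough illustration of fine-tuning with multiple partial LoRA modules, the sketch below attaches several named low-rank adapters to a single frozen linear layer, one switchable per subtask. The class, adapter names, ranks, and shapes are hypothetical stand-ins, not JELLY's actual architecture.

```python
import numpy as np

class MultiLoRALinear:
    """A frozen linear layer with several named low-rank (LoRA) adapters.
    Each adapter is a small rank-r update W + (alpha/r) * B @ A that can be
    enabled per subtask (e.g. emotion recognition vs. context reasoning).
    Illustrative sketch only; shapes and names are assumptions."""

    def __init__(self, d_in, d_out, rank=4, alpha=8,
                 adapters=("emotion", "context"), seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) * 0.02  # frozen base weight
        self.scale = alpha / rank
        # A is random, B is zero, so every adapter starts as a no-op.
        self.A = {n: rng.standard_normal((rank, d_in)) * 0.02 for n in adapters}
        self.B = {n: np.zeros((d_out, rank)) for n in adapters}

    def forward(self, x, adapter=None):
        y = x @ self.W.T
        if adapter is not None:
            delta = self.B[adapter] @ self.A[adapter]  # rank-r weight update
            y = y + self.scale * (x @ delta.T)
        return y

layer = MultiLoRALinear(d_in=32, d_out=16)
x = np.ones((1, 32))
base = layer.forward(x)
# With B zero-initialised, enabling an adapter leaves the output unchanged.
print(np.allclose(base, layer.forward(x, adapter="emotion")))  # True
```

Only the small A/B matrices would be trained per subtask; the base weight W stays frozen, which is what makes the adapters "partial".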

EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector

1 code implementation · 4 Nov 2024 · Deok-Hyeon Cho, Hyung-Seok Oh, Seung-bin Kim, Seong-Whan Lee

Emotional text-to-speech (TTS) technology has achieved significant progress in recent years; however, challenges remain owing to the inherent complexity of emotions and limitations of the available emotional speech datasets and models.

Decoder, Emotional Speech Synthesis, +1

EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech

1 code implementation · 12 Jun 2024 · Deok-Hyeon Cho, Hyung-Seok Oh, Seung-bin Kim, Sang-Hoon Lee, Seong-Whan Lee

Despite rapid advances in the field of emotional text-to-speech (TTS), recent studies primarily focus on mimicking the average style of a particular emotion.

Emotional Speech Synthesis, Text to Speech, +1

MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms

1 code implementation · 11 Jun 2024 · Seung-bin Kim, Chan-yeong Lim, Jungwoo Heo, Ju-ho Kim, Hyun-seo Shin, Kyo-Won Koo, Ha-Jin Yu

In speaker verification systems, the utilization of short utterances presents a persistent challenge, leading to performance degradation primarily due to insufficient phonetic information to characterize the speakers.

Speaker Verification

TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data

no code implementations · 17 Jan 2024 · Seung-bin Kim, Sang-Hoon Lee, Seong-Whan Lee

With this method, despite training exclusively on the target language's monolingual data, we can generate target language speech in the inference stage using language-agnostic speech embedding from the source language speech.

Sentence, Speech-to-Speech Translation, +1

Integrated Replay Spoofing-aware Text-independent Speaker Verification

no code implementations · 10 Jun 2020 · Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Seung-bin Kim, Ha-Jin Yu

In this paper, we propose two approaches for building an integrated system of speaker verification and presentation attack detection: an end-to-end monolithic approach and a back-end modular approach.

Multi-Task Learning, Speaker Identification, +1
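The back-end modular approach described above can be sketched as a simple decision rule that combines independent speaker verification (SV) and presentation attack detection (PAD) scores. The function name, score ranges, and thresholds are illustrative assumptions, not the paper's calibration.

```python
def spoof_aware_verify(sv_score, pad_score, sv_threshold=0.5, pad_threshold=0.5):
    """Back-end modular sketch: accept a trial only when the SV module
    judges the speaker a target AND the PAD module judges the input
    bona fide. Both scores are assumed to lie in [0, 1]; thresholds
    are illustrative, not calibrated."""
    is_target = sv_score >= sv_threshold
    is_bonafide = pad_score >= pad_threshold
    return is_target and is_bonafide

# A replayed recording of the target speaker: SV passes, PAD rejects.
print(spoof_aware_verify(sv_score=0.92, pad_score=0.10))  # False
# A genuine target-speaker trial: both modules pass.
print(spoof_aware_verify(sv_score=0.92, pad_score=0.88))  # True
```

The end-to-end monolithic alternative would instead train one model to emit a single joint decision, trading this modularity for shared representations.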

Segment Aggregation for short utterances speaker verification using raw waveforms

1 code implementation · 7 May 2020 · Seung-bin Kim, Jee-weon Jung, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

The proposed method segments an input utterance into several short utterances and then aggregates the segment embeddings extracted from the segmented inputs to compose a speaker embedding.

Speaker Verification
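The segment-then-aggregate idea above can be sketched as follows. The random-projection `embed_fn` is a hypothetical stand-in for a learned segment encoder, and mean pooling stands in for the paper's learned aggregation; 16 kHz audio is assumed.

```python
import numpy as np

def segment_and_aggregate(waveform, segment_len, hop, embed_fn):
    """Split a raw waveform into overlapping short segments, embed each
    segment, and pool the segment embeddings into one speaker embedding.
    Minimal sketch: the actual encoder and aggregation are trained
    networks, not a fixed projection and a mean."""
    segments = [
        waveform[start:start + segment_len]
        for start in range(0, len(waveform) - segment_len + 1, hop)
    ]
    seg_embeddings = np.stack([embed_fn(seg) for seg in segments])
    return seg_embeddings.mean(axis=0)

# Hypothetical stand-in encoder: a fixed random projection per segment.
rng = np.random.default_rng(0)
proj = rng.standard_normal((16000, 64))
embed = lambda seg: seg @ proj[: len(seg)]

utterance = rng.standard_normal(32000)  # ~2 s of audio at 16 kHz
spk_emb = segment_and_aggregate(utterance, segment_len=16000,
                                hop=8000, embed_fn=embed)
print(spk_emb.shape)  # (64,)
```

Because every segment maps to the same embedding space, the pooled vector stays fixed-dimensional regardless of utterance length, which is what makes short-utterance enrollment and verification comparable.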

Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms

2 code implementations · 1 Apr 2020 · Jee-weon Jung, Seung-bin Kim, Hye-jin Shim, Ju-ho Kim, Ha-Jin Yu

Recent advances in deep learning have facilitated the design of speaker verification systems that directly input raw waveforms.

Text-Independent Speaker Verification
