no code implementations • 1 May 2024 • Yoori Oh, Yoseob Han, Kyogu Lee
To overcome the limitation, we introduce a method that employs a distance sampling-based paraphraser leveraging ChatGPT, utilizing a distance function to generate a controllable distribution of manipulated text data.
no code implementations • 11 Nov 2022 • Yoori Oh, Juheon Lee, Yoseob Han, Kyogu Lee
However, the emotional latent space produced by existing models makes it difficult to control continuous emotional intensity, because features such as emotion and speaker are entangled.
no code implementations • 31 Oct 2022 • Eungbeom Kim, Jinhee Kim, Yoori Oh, KyungSu Kim, Minju Park, Jaeheon Sim, Jinwoo Lee, Kyogu Lee
In this paper, we aim to unveil the impact of data augmentation in audio-language multi-modal learning, which has not been explored despite its importance.
Ranked #2 on Audio to Text Retrieval on AudioCaps