Search Results for author: Jihyun Lee

Found 20 papers, 2 papers with code

Mirror: Multimodal Cognitive Reframing Therapy for Rolling with Resistance

no code implementations • 16 Apr 2025 • Subin Kim, Hoonrae Kim, Jihyun Lee, Yejin Jeon, Gary Geunbae Lee

Recent studies have explored the use of large language models (LLMs) in psychotherapy; however, text-based cognitive behavioral therapy (CBT) models often struggle with client resistance, which can weaken therapeutic alliance.

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

no code implementations • 7 Apr 2025 • Jihyun Lee, Weipeng Xu, Alexander Richard, Shih-En Wei, Shunsuke Saito, Shaojie Bai, Te-Li Wang, Minhyuk Sung, Tae-Kyun Kim, Jason Saragih

To enable real-time inference, we introduce (1) cascaded body-hand denoising diffusion, which effectively models the correlation between egocentric body and hand motions in a fast, feed-forward manner, and (2) diffusion distillation, which enables high-quality motion estimation with a single denoising step.

Denoising Motion Estimation

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

no code implementations • 28 Mar 2025 • Yunhong Min, Daehyeon Choi, Kyeongmin Yeo, Jihyun Lee, Minhyuk Sung

We introduce ORIGEN, the first zero-shot method for 3D orientation grounding in text-to-image generation across multiple objects and diverse categories.

Text-to-Image Generation

Keyword-Aware ASR Error Augmentation for Robust Dialogue State Tracking

no code implementations • 10 Sep 2024 • Jihyun Lee, Solee Im, Wonjun Lee, Gary Geunbae Lee

Dialogue State Tracking (DST) is a key part of task-oriented dialogue systems, identifying important information in conversations.

Automatic Speech Recognition (ASR) +4

Inference is All You Need: Self Example Retriever for Cross-domain Dialogue State Tracking with ChatGPT

no code implementations • 10 Sep 2024 • Jihyun Lee, Gary Geunbae Lee

Traditional dialogue state tracking approaches heavily rely on extensive training data and handcrafted features, limiting their scalability and adaptability to new domains.

All Dialogue State Tracking +2

Dense Hand-Object(HO) GraspNet with Full Grasping Taxonomy and Dynamics

no code implementations • 6 Sep 2024 • Woojin Cho, Jihyun Lee, Minjae Yi, Minje Kim, Taeyun Woo, Donghwan Kim, Taewook Ha, Hyokeun Lee, Je-Hwan Ryu, Woontack Woo, Tae-Kyun Kim

Accurate hand and object 3D meshes are obtained by fitting the hand parametric model (MANO) and the hand implicit function (HALO) to multi-view RGBD frames, with the MoCap system only for objects.

3D Hand Pose Estimation Object

Prediction-Feedback DETR for Temporal Action Detection

no code implementations • 29 Aug 2024 • JiHwan Kim, Miso Lee, Cheol-Ho Cho, Jihyun Lee, Jae-Pil Heo

Temporal Action Detection (TAD) is fundamental yet challenging for real-world video applications.

Action Detection Prediction

InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion

no code implementations • CVPR 2024 • Jihyun Lee, Shunsuke Saito, Giljoo Nam, Minhyuk Sung, Tae-Kyun Kim

Sampling from our model yields plausible and diverse two-hand shapes in close interaction with or without an object.

Diversity

BrainTalker: Low-Resource Brain-to-Speech Synthesis with Transfer Learning using Wav2Vec 2.0

no code implementations • 21 Dec 2023 • Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang

Specifically, we train an encoder module to map ECoG signals to latent embeddings that match Wav2Vec 2.0 representations of the corresponding spoken speech.

Speech Synthesis Transfer Learning

Style Modeling for Multi-Speaker Articulation-to-Speech

no code implementations • 21 Dec 2023 • Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang

In this paper, we propose a neural articulation-to-speech (ATS) framework that synthesizes high-quality speech from articulatory signals in a multi-speaker setting.

Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes

1 code implementation • CVPR 2023 • Jihyun Lee, Minhyuk Sung, Honggyu Choi, Tae-Kyun Kim

To handle the shape complexity and interaction context between two hands, Im2Hands models the occupancy volume of two hands - conditioned on an RGB image and coarse 3D keypoints - by two novel attention-based modules responsible for (1) initial occupancy estimation and (2) context-aware occupancy refinement, respectively.

Image Reconstruction Vocal Bursts Valence Prediction

SF-DST: Few-Shot Self-Feeding Reading Comprehension Dialogue State Tracking with Auxiliary Task

no code implementations • 16 Sep 2022 • Jihyun Lee, Gary Geunbae Lee

A few-shot dialogue state tracking (DST) model tracks user requests in dialogue with reliable accuracy even with a small amount of data.

Dialogue State Tracking Reading Comprehension

Fast DCTTS: Efficient Deep Convolutional Text-to-Speech

no code implementations • 1 Apr 2021 • Minsu Kang, Jihyun Lee, Simin Kim, Injung Kim

We propose an end-to-end speech synthesizer, Fast DCTTS, that synthesizes speech in real time on a single CPU thread.

Computational Efficiency Text to Speech
