Search Results for author: Ji Woo Hong

Found 8 papers, 2 papers with code

TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis

no code implementations 8 Apr 2025 Tri Ton, Ji Woo Hong, Chang D. Yoo

This paper introduces Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning (TARO), a novel framework for high-fidelity and temporally coherent video-to-audio synthesis.

Audio Synthesis FAD

ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On

no code implementations CVPR 2025 Ji Woo Hong, Tri Ton, Trung X. Pham, Gwanhyeong Koo, Sunjae Yoon, Chang D. Yoo

This paper introduces ITA-MDT, the Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On (IVTON), designed to overcome the limitations of previous approaches by leveraging the Masked Diffusion Transformer (MDT) for improved handling of both global garment context and fine-grained details.

Denoising Virtual Try-on

E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization

no code implementations 13 Feb 2025 Trung X. Pham, Zhang Kang, Ji Woo Hong, Xuran Zheng, Chang D. Yoo

We propose E-MD3C (Efficient Masked Diffusion Transformer with Disentangled Conditions and Compact Collector), a highly efficient framework for zero-shot object image customization.

Computational Efficiency Denoising +1

Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation

no code implementations 16 Aug 2024 Tri Ton, Ji Woo Hong, SooHwan Eom, Jun Yeop Shim, Junyeong Kim, Chang D. Yoo

The 3D pathway generates spatially accurate class-agnostic mask proposals of common indoor objects from 3D point cloud data using a pre-trained 3D model, while the 2D pathway utilizes a pre-trained open-vocabulary instance segmentation model to identify a diverse array of object proposals from multi-view RGB-D images.

3D Instance Segmentation open vocabulary 3d instance segmentation +1

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

1 code implementation 25 Jul 2024 Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong, Chang D. Yoo

Current image editing methods primarily utilize DDIM Inversion, employing a two-branch diffusion approach to preserve the attributes and layout of the original image.

Text-based Image Editing

Neutral Editing Framework for Diffusion-based Video Editing

no code implementations 10 Dec 2023 Sunjae Yoon, Gwanhyeong Koo, Ji Woo Hong, Chang D. Yoo

To this end, this paper proposes the Neutral Editing (NeuEdit) framework to enable complex non-rigid editing by changing the motion of a person/object in a video, which has never been attempted before.

Style Transfer Video Editing

Self-Supervised Visual Representation Learning via Residual Momentum

no code implementations 17 Nov 2022 Trung X. Pham, Axi Niu, Zhang Kang, Sultan Rizky Madjid, Ji Woo Hong, Daehyeok Kim, Joshua Tian Jin Tee, Chang D. Yoo

To solve this problem, we propose "residual momentum" to directly reduce this gap to encourage the student to learn the representation as close to that of the teacher as possible, narrow the performance gap with the teacher, and significantly improve the existing SSL.

Contrastive Learning Representation Learning +1

Selective Query-guided Debiasing for Video Corpus Moment Retrieval

1 code implementation 17 Oct 2022 Sunjae Yoon, Ji Woo Hong, Eunseop Yoon, Dahyun Kim, Junyeong Kim, Hee Suk Yoon, Chang D. Yoo

Video moment retrieval (VMR) aims to localize target moments in untrimmed videos pertinent to a given textual query.

Moment Retrieval Retrieval +1
