no code implementations • 13 Feb 2025 • Trung X. Pham, Zhang Kang, Ji Woo Hong, Xuran Zheng, Chang D. Yoo
We propose E-MD3C ($\underline{E}$fficient $\underline{M}$asked $\underline{D}$iffusion Transformer with Disentangled $\underline{C}$onditions and $\underline{C}$ompact $\underline{C}$ollector), a highly efficient framework for zero-shot object image customization.
1 code implementation • 2 Feb 2024 • Trung X. Pham, Zhang Kang, Chang D. Yoo
Our compact 33MB model achieves an FID of 7.42, surpassing a prior U-Net latent diffusion approach (FID 8.07) with $11\times$ fewer parameters.
no code implementations • 17 Nov 2022 • Trung X. Pham, Axi Niu, Zhang Kang, Sultan Rizky Madjid, Ji Woo Hong, Daehyeok Kim, Joshua Tian Jin Tee, Chang D. Yoo
To solve this problem, we propose "residual momentum", which directly reduces this gap so that the student learns a representation as close as possible to the teacher's. This narrows the performance gap with the teacher and significantly improves existing SSL methods.