2 code implementations • 13 Nov 2024 • Yeong-Joon Ju, Ho-Joong Kim, Seong-Whan Lee
Recent multimodal retrieval methods have endowed text-based retrievers with multimodal capabilities by utilizing pre-training strategies for visual-text alignment.
1 code implementation • 16 Feb 2024 • Ji-Hoon Park, Yeong-Joon Ju, Seong-Whan Lee
To address this issue, we propose three research questions to interpret the diffusion process from the perspective of the visual concepts generated by the model and the regions the model attends to at each time step.
1 code implementation • 11 Oct 2023 • Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee
We validate the effectiveness of our framework by addressing false correlations and improving inferences for classes with the worst performance in real-world settings.
2 code implementations • Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022 • Yeong-Joon Ju, Gun-Hee Lee, Jung-Ho Hong, Seong-Whan Lee
In addition, the lack of high-quality paired data remains an obstacle for both methods.