no code implementations • 28 May 2024 • Sangmin Woo, Jaehyuk Jang, Donguk Kim, Yubin Choi, Changick Kim
By integrating the probability distributions from both the original and transformed images, RITUAL effectively reduces hallucinations.
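The integration of the two probability distributions can be sketched as a simple mixture of the next-token distributions computed from the original and the transformed image. The linear weighting and the `alpha` parameter are illustrative assumptions, not RITUAL's exact formulation:

```python
import numpy as np

def ensemble_token_distribution(p_original, p_transformed, alpha=0.5):
    """Hypothetical sketch: mix the next-token probability distributions
    obtained from the original image and from a transformed view, then
    renormalize. `alpha` and the linear mixture are assumptions."""
    p = alpha * np.asarray(p_original) + (1 - alpha) * np.asarray(p_transformed)
    return p / p.sum()

# Combine two toy 3-token vocabulary distributions.
p = ensemble_token_distribution([0.7, 0.2, 0.1], [0.5, 0.3, 0.2])
```

A token favored under only one view is down-weighted in the mixture, which is one intuition for why combining views can suppress view-specific hallucinations.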
no code implementations • 28 May 2024 • Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim
This study addresses an issue in Large Vision-Language Models (LVLMs) in which excessive attention to a few image tokens, referred to as blind tokens, leads to hallucinatory responses in tasks requiring fine-grained understanding of visual objects.
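One way to make the blind-token notion concrete is to flag image tokens whose received attention far exceeds the uniform share. The function name and the `ratio` threshold below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def find_blind_tokens(attention_weights, ratio=2.0):
    """Hypothetical sketch: return indices of image tokens whose received
    attention exceeds `ratio` times the uniform share 1/N. The threshold
    rule is an assumption for illustration."""
    attn = np.asarray(attention_weights, dtype=float)
    uniform = 1.0 / len(attn)
    return [i for i, a in enumerate(attn) if a > ratio * uniform]

# Five image tokens; attention mass concentrates on token 0.
blind = find_blind_tokens([0.5, 0.3, 0.1, 0.05, 0.05])
```

Here the uniform share is 0.2, so only token 0 (attention 0.5) crosses the 2x threshold.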
no code implementations • 26 Dec 2023 • Jaehyuk Jang, Yooseung Wang, Changick Kim
Recently, multimodal prompting, which introduces learnable missing-aware prompts for all missing-modality cases, has exhibited impressive performance.
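The missing-aware prompting idea can be sketched as keeping one learnable prompt per missing-modality case and prepending the matching prompt to the input sequence. The case names, embedding size, and lookup structure below are assumptions for illustration, not the cited method's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # assumed embedding dimension

# Hypothetical sketch: one learnable prompt vector per missing-modality case.
missing_aware_prompts = {
    "none_missing": rng.normal(size=D),
    "image_missing": rng.normal(size=D),
    "text_missing": rng.normal(size=D),
}

def prepend_prompt(tokens, case):
    """Prepend the case-specific prompt to a (seq_len, D) token matrix."""
    return np.vstack([missing_aware_prompts[case][None, :], tokens])

# A toy 3-token input with the image modality missing.
out = prepend_prompt(np.zeros((3, D)), "image_missing")
```

At training time each prompt vector would be optimized jointly with the model, so the network learns case-specific behavior without separate models per missing-modality pattern.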