no code implementations • 28 May 2024 • Sangmin Woo, Jaehyuk Jang, Donguk Kim, Yubin Choi, Changick Kim
By integrating the probability distributions from both the original and transformed images, RITUAL effectively reduces hallucinations.
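As a rough illustration of what integrating two probability distributions can look like, here is a minimal sketch that assumes the next-token distributions from the original and the transformed image are fused by a weighted sum of log-probabilities and then renormalized. The function name `combine_distributions`, the weight `alpha`, and the toy values are hypothetical, chosen for illustration only; the excerpt does not state the exact fusion rule RITUAL uses.

```python
import math

def combine_distributions(p_original, p_transformed, alpha=1.0):
    """Fuse two next-token distributions over the same vocabulary.

    Assumption (not from the source): fusion is a weighted sum of
    log-probabilities, followed by a softmax to renormalize.
    """
    if len(p_original) != len(p_transformed):
        raise ValueError("distributions must cover the same vocabulary")
    eps = 1e-12  # guards against log(0)
    scores = [
        math.log(p + eps) + alpha * math.log(q + eps)
        for p, q in zip(p_original, p_transformed)
    ]
    # Numerically stable softmax over the combined scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy distributions over a three-token vocabulary (illustrative values).
p_orig = [0.5, 0.4, 0.1]   # conditioned on the original image
p_trans = [0.2, 0.7, 0.1]  # conditioned on the transformed image
fused = combine_distributions(p_orig, p_trans)
```

In this toy case, the token that both views support reasonably well ends up with the highest fused probability, even though it was not the top choice under the original image alone. Setting `alpha=0` recovers the original distribution.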
no code implementations • 28 May 2024 • Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim
This study addresses an issue observed in Large Vision Language Models (LVLMs): excessive attention to a few image tokens, referred to as blind tokens, leads to hallucinatory responses in tasks that require fine-grained understanding of visual objects.