no code implementations • 19 Jan 2024 • Haibi Wang, Weifeng Ge
With the breakthrough of multi-modal large language models, answering complex visual questions that demand advanced reasoning abilities and world knowledge has become a more important testbed for developing AI models than ever before.