13 Feb 2024 • Michael Dorkenwald, Nimrod Barazani, Cees G. M. Snoek, Yuki M. Asano
Vision-Language Models (VLMs), such as Flamingo and GPT-4V, have shown immense potential by integrating large language models with vision systems.