no code implementations • 13 Feb 2023 • Henrik Voigt, Jan Hombeck, Monique Meuschke, Kai Lawonn, Sina Zarrieß
Existing language and vision models achieve impressive performance in image-text understanding.
Object Open-Ended Question Answering