1 code implementation • Findings (ACL) 2022 • Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, Iryna Gurevych
In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.
no code implementations • 9 Sep 2021 • Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, Stefan Roth
Reasoning over multiple modalities, e.g., in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains.
no code implementations • 4 Oct 2018 • Jan-Martin O. Steitz, Faraz Saeedan, Stefan Roth
First, we introduce a novel multi-view pooling layer to perform a 3D aggregation of 2D CNN-features extracted from each view.