Search Results for author: Jan-Martin O. Steitz

xGQA: Cross-Lingual Visual Question Answering

In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.

Reasoning over multiple modalities, e.g., in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains.

First, we introduce a novel multi-view pooling layer to perform a 3D aggregation of 2D CNN-features extracted from each view.
