1 code implementation • 31 Jan 2019 • Hedi Ben-Younes, Rémi Cadene, Nicolas Thome, Matthieu Cord
We demonstrate the practical interest of our fusion model by using BLOCK for two challenging tasks: Visual Question Answering (VQA) and Visual Relationship Detection (VRD), where we design end-to-end learnable architectures for representing relevant interactions between modalities.
6 code implementations • ICCV 2017 • Hedi Ben-Younes, Rémi Cadene, Matthieu Cord, Nicolas Thome
Bilinear models provide an appealing framework for mixing and merging information in Visual Question Answering (VQA) tasks.
Ranked #35 on Visual Question Answering (VQA) on VQA v2 test-std
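As a rough illustration of what "bilinear" means in the entries above, the sketch below fuses a question vector and an image vector through a full three-way tensor, so that each output coordinate is a weighted sum of all pairwise products of the inputs. This is a generic bilinear form under stated assumptions, not the architecture from either paper: the function name `bilinear_fusion`, the tensor `T`, and the toy values are all hypothetical.

```python
from typing import List

Vector = List[float]
Tensor3 = List[List[List[float]]]  # indexed as T[i][j][k]: [dim_q][dim_v][dim_out]


def bilinear_fusion(q: Vector, v: Vector, T: Tensor3) -> Vector:
    """Generic bilinear form: y[k] = sum_i sum_j q[i] * T[i][j][k] * v[j]."""
    dim_out = len(T[0][0])
    y = [0.0] * dim_out
    for i, q_i in enumerate(q):
        for j, v_j in enumerate(v):
            for k in range(dim_out):
                y[k] += q_i * T[i][j][k] * v_j
    return y


# Toy inputs (hypothetical values, not taken from either paper).
question = [1.0, 2.0]      # stand-in for a question embedding
image = [3.0, 4.0, 5.0]    # stand-in for an image feature vector

# Output 0 sums every pairwise product q[i] * v[j];
# output 1 keeps only the single product q[0] * v[0].
T = [
    [[1.0, 1.0 if (i == 0 and j == 0) else 0.0] for j in range(len(image))]
    for i in range(len(question))
]

fused = bilinear_fusion(question, image, T)
```

The full tensor holds dim_q × dim_v × dim_out weights, which grows quickly with realistic feature sizes; that cost is the usual motivation for constraining or factorizing the tensor rather than learning it directly.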