2 code implementations • 23 Oct 2023 • Maryam Hashemi, Ghazaleh Mahmoudi, Sara Kodeiri, Hadi Sheikhi, Sauleh Eetemadi
Large-scale pretrained models such as LXMERT are becoming popular for learning cross-modal representations on text-image pairs for vision-language tasks.
Ranked #1 on
Visual Question Answering
on VQA v2 test-std
1 code implementation • 6 Aug 2023 • Fatemah Almeman, Hadi Sheikhi, Luis Espinosa-Anke
Definitions are a fundamental building block in lexicography, linguistics and computational semantics.