Search Results for author: Walid Bousselham

Found 4 papers, 4 papers with code

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

1 code implementation · 4 Apr 2024 · Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne

Vision Transformers (ViTs), with their ability to model long-range dependencies through self-attention mechanisms, have become a standard architecture in computer vision.

Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

1 code implementation · 1 Dec 2023 · Walid Bousselham, Felix Petersen, Vittorio Ferrari, Hilde Kuehne

To leverage those capabilities, we propose a Grounding Everything Module (GEM) that generalizes the idea of value-value attention introduced by CLIPSurgery to a self-self attention path.
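The value-value idea this sentence refers to can be sketched in a few lines: instead of standard query-key attention, tokens from a single projection attend to themselves, which tends to sharpen spatial localization. This is only an illustrative simplification, not the authors' implementation; the function name, shapes, and the `scale` parameter are assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_self_attn(feats, scale=1.0):
    # feats: (n_tokens, d) tokens from ONE projection (e.g. the values),
    # used as both queries and keys -- the "self-self" idea.
    attn = softmax(scale * feats @ feats.T)  # (n_tokens, n_tokens)
    return attn @ feats                      # tokens re-aggregated by self-similarity
```

In the paper, GEM reportedly applies this pattern across the query, key, and value projections (q-q, k-k, v-v) rather than only value-value as in CLIPSurgery.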

Tasks: Image Retrieval, Object Localization, +2

Learning Situation Hyper-Graphs for Video Question Answering

1 code implementation · CVPR 2023 · Aisha Urooj Khan, Hilde Kuehne, Bo Wu, Kim Chheu, Walid Bousselham, Chuang Gan, Niels Lobo, Mubarak Shah

The proposed method is trained in an end-to-end manner and optimized by a VQA loss with the cross-entropy function and a Hungarian matching loss for the situation graph prediction.
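The matching loss mentioned here pairs each predicted situation-graph element with a ground-truth element before computing cross-entropy. A minimal sketch of that set-matching idea follows; real implementations use the Hungarian algorithm (e.g. `scipy.optimize.linear_sum_assignment`), while this toy version brute-forces all assignments for clarity. The function name, shapes, and the equal-size assumption are hypothetical, not taken from the paper:

```python
from itertools import permutations
import numpy as np

def matching_ce_loss(pred_logits, target_labels):
    # pred_logits: (n, n_classes); target_labels: (n,) int class ids.
    # Find the prediction-to-target assignment minimizing total
    # cross-entropy, and return the mean cross-entropy under it.
    probs = np.exp(pred_logits - pred_logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    n = len(target_labels)
    # cost[i, j] = -log p_i(class of target j)
    cost = -np.log(probs[:, target_labels] + 1e-9)
    best = min(sum(cost[i, p[i]] for i in range(n))
               for p in permutations(range(n)))
    return float(best / n)
```

Because the loss is over the best assignment, the prediction order does not matter, which is what makes it suitable for unordered graph elements.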

Ranked #6 on Video Question Answering on AGQA 2.0 balanced (Average Accuracy metric)

Tasks: Question Answering, Video Question Answering, +1
