no code implementations • 24 Nov 2023 • Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, Matthieu Cord, Patrick Perez, Mohamed Elhoseiny
Diffusion-based generative models excel in perceptually impressive synthesis but face challenges in interpretability.
1 code implementation • 10 Oct 2023 • Eslam Mohamed BAKR, Mohamed Ayman, Mahmoud Ahmed, Habib Slim, Mohamed Elhoseiny
To this end, we formulate the 3D visual grounding problem as a sequence-to-sequence (Seq2Seq) task by first predicting a chain of anchors and then the final target.
1 code implementation • ICCV 2023 • Eslam Mohamed BAKR, Pengzhan Sun, Xiaoqian Shen, Faizan Farooq Khan, Li Erran Li, Mohamed Elhoseiny
A human evaluation, which aligned with 95% of our evaluations on average, was conducted to probe the effectiveness of HRS-Bench.
no code implementations • 10 Apr 2023 • Eslam Mohamed BAKR, Pengzhan Sun, Li Erran Li, Mohamed Elhoseiny
In addition, we design a formulation for measuring the bias of generated captions as prompt-based image captioning instead of using language classifiers.
1 code implementation • 25 Nov 2022 • Eslam Mohamed BAKR, Yasmeen Alsaedy, Mohamed Elhoseiny
The main question we address in this paper is: "Can we consolidate the 3D visual stream by 2D clues synthesized from point clouds and efficiently utilize them in training and testing?"
1 code implementation • 14 Nov 2022 • Eslam Mohamed BAKR, Ahmad El Sallab, Mohsen A. Rashwan
Recently, attention mechanisms have been explored with ConvNets across both the spatial and channel dimensions.