Search Results for author: Eslam Mohamed BAKR

Found 6 papers, 4 papers with code

ToddlerDiffusion: Flash Interpretable Controllable Diffusion Model

no code implementations • 24 Nov 2023 • Eslam Mohamed BAKR, Liangbing Zhao, Vincent Tao Hu, Matthieu Cord, Patrick Perez, Mohamed Elhoseiny

Diffusion-based generative models excel in perceptually impressive synthesis but face challenges in interpretability.

Denoising, Image Generation

CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding

1 code implementation • 10 Oct 2023 • Eslam Mohamed BAKR, Mohamed Ayman, Mahmoud Ahmed, Habib Slim, Mohamed Elhoseiny

To this end, we formulate the 3D visual grounding problem as a sequence-to-sequence (Seq2Seq) task by first predicting a chain of anchors and then the final target.

Visual Grounding

ImageCaptioner$^2$: Image Captioner for Image Captioning Bias Amplification Assessment

no code implementations • 10 Apr 2023 • Eslam Mohamed BAKR, Pengzhan Sun, Li Erran Li, Mohamed Elhoseiny

In addition, we design a formulation for measuring the bias of generated captions as prompt-based image captioning instead of using language classifiers.

Image Captioning

Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding

1 code implementation • 25 Nov 2022 • Eslam Mohamed BAKR, Yasmeen Alsaedy, Mohamed Elhoseiny

The main question we address in this paper is: "Can we consolidate the 3D visual stream with 2D clues synthesized from point clouds and efficiently utilize them in training and testing?"

Knowledge Distillation, Visual Grounding

PKCAM: Previous Knowledge Channel Attention Module

1 code implementation • 14 Nov 2022 • Eslam Mohamed BAKR, Ahmad El Sallab, Mohsen A. Rashwan

Recently, attention mechanisms have been explored with ConvNets, both across the spatial and channel dimensions.

Image Classification, Object Detection, +1
