no code implementations • 27 Feb 2024 • George Eskandar, Chongzhe Zhang, Abhishek Kaushik, Karim Guirguis, Mohamed Sayed, Bin Yang
3D Object Detectors (3D-OD) are crucial for understanding the environment in many robotic tasks, especially autonomous driving.
no code implementations • 23 Jun 2023 • George Eskandar, Shuai Zhang, Mohamed Abdelsamad, Mark Youssef, Diandian Guo, Bin Yang
Data efficiency, or the ability to generalize from a small amount of labeled data, remains a major challenge in deep learning.
1 code implementation • 16 May 2023 • George Eskandar, Diandian Guo, Karim Guirguis, Bin Yang
Second, in contrast to previous works which employ one discriminator that overfits the target domain semantic distribution, we employ a discriminator for the whole image and multiscale discriminators on the image patches.
1 code implementation • IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022 • George Eskandar, Mohamed Abdelsamad, Karim Armanious, Shuai Zhang, Bin Yang
Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a semantic layout is used to generate a photorealistic image.
Ranked #11 on Image-to-Image Translation on ADE20K Labels-to-Photos
Multimodal Unsupervised Image-To-Image Translation • Translation
no code implementations • 16 May 2023 • George Eskandar, Youssef Farag, Tarun Yenamandra, Daniel Cremers, Karim Guirguis, Bin Yang
Moreover, we employ an unsupervised latent exploration algorithm in the $\mathcal{S}$-space of the generator and show that it is more efficient than the conventional $\mathcal{W}^{+}$-space in controlling the image content.
no code implementations • CVPR 2023 • Karim Guirguis, Johannes Meier, George Eskandar, Matthias Kayser, Bin Yang, Juergen Beyerer
Our contribution is three-fold: (1) we design a standalone lightweight generator with (2) class-wise heads (3) to generate and replay diverse instance-level base features to the RoI head while finetuning on the novel data.
Data-free Knowledge Distillation • Few-Shot Object Detection
no code implementations • 11 Oct 2022 • Karim Guirguis, Mohamed Abdelsamad, George Eskandar, Ahmed Hendawy, Matthias Kayser, Bin Yang, Juergen Beyerer
We observe that the large performance gap between two-stage and one-stage FSODs is mainly due to the weak discriminability of the one-stage detectors, which is explained by a small post-fusion receptive field and a small number of foreground samples in the loss function.
Ranked #13 on Few-Shot Object Detection on MS-COCO (10-shot)
no code implementations • 11 Apr 2022 • Karim Guirguis, Ahmed Hendawy, George Eskandar, Mohamed Abdelsamad, Matthias Kayser, Juergen Beyerer
In this work, we propose a constraint-based finetuning approach (CFA) to alleviate catastrophic forgetting, while achieving competitive results on the novel task without increasing the model capacity.
Ranked #8 on Few-Shot Object Detection on MS-COCO (10-shot)
no code implementations • 11 Apr 2022 • Karim Guirguis, George Eskandar, Matthias Kayser, Bin Yang, Juergen Beyerer
First, we leverage a meta-training paradigm, where we learn the domain shift on the base classes, then transfer the domain knowledge to the novel classes.
no code implementations • 7 Mar 2022 • George Eskandar, Robert A. Marsden, Pavithran Pandiyan, Mario Döbler, Karim Guirguis, Bin Yang
Integrating different representations from complementary sensing modalities is crucial for robust scene interpretation in autonomous driving.
no code implementations • 8 Feb 2022 • George Eskandar, Sanjeev Sudarsan, Karim Guirguis, Janaranjani Palaniswamy, Bharath Somashekar, Bin Yang
Lidar sensors are costly yet critical for understanding the 3D environment in autonomous driving.
1 code implementation • 29 Sep 2021 • George Eskandar, Mohamed Abdelsamad, Karim Armanious, Bin Yang
Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a photorealistic image is synthesized from a segmentation mask.
no code implementations • 19 Feb 2021 • George Eskandar, Alexander Braun, Martin Meinke, Karim Armanious, Bin Yang
Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames.