1 code implementation • ECCV 2020 • Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas
Due to the scarcity and unsuitability of existing 3D-oriented linguistic resources for this task, we first develop two large-scale and complementary visio-linguistic datasets: i) Sr3D, which contains 83.5K template-based utterances leveraging spatial relations with other fine-grained object classes to localize a referred object in a given scene, and ii) Nr3D, which contains 41.5K natural, free-form utterances collected by deploying a 2-player object reference game in 3D scenes.
no code implementations • 15 Apr 2022 • Youssef Mohamed, Faizan Farooq Khan, Kilichbek Haydarov, Mohamed Elhoseiny
As a step in this direction, the ArtEmis dataset was recently introduced as a large-scale dataset of emotional reactions to images along with language explanations of these chosen emotions.
1 code implementation • 6 Mar 2022 • Abduallah Mohamed, Deyao Zhu, Warren Vu, Mohamed Elhoseiny, Christian Claudel
AMD is a metric that quantifies how close the whole set of generated samples is to the ground truth.
Ranked #1 on Trajectory Prediction on ETH
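The listing does not spell out how AMD is computed; as a rough illustration, an average-minimum-distance style metric over sample sets can be sketched as below (the function name and exact formulation are assumptions for illustration, not the paper's definition):

```python
import numpy as np

def average_min_distance(generated, ground_truth):
    """Average, over ground-truth points, of the Euclidean distance to the
    nearest generated sample -- a minimal average-minimum-distance sketch.

    generated:    (n, d) array of generated samples
    ground_truth: (m, d) array of ground-truth points
    """
    # Pairwise distances, shape (m, n)
    diffs = ground_truth[:, None, :] - generated[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # For each ground-truth point, keep its closest generated sample, then average
    return dists.min(axis=1).mean()
```

A metric of this shape rewards sample sets that cover every ground-truth mode, rather than just matching on average.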
no code implementations • 2 Mar 2022 • Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny
The main question we address in this paper is how to scale up visual recognition of unseen classes, also known as zero-shot learning, to tens of thousands of categories as in the ImageNet-21K benchmark.
1 code implementation • 6 Jan 2022 • Yuanpeng Li, Joel Hestness, Mohamed Elhoseiny, Liang Zhao, Kenneth Church
This paper proposes an efficient approach to learning disentangled representations with causal mechanisms based on the difference of conditional probabilities in original and new distributions.
1 code implementation • 29 Dec 2021 • Ivan Skorokhodov, Sergey Tulyakov, Mohamed Elhoseiny
We build our model on top of StyleGAN2; it is only about 5% more expensive to train at the same resolution while achieving almost the same image quality.
no code implementations • 24 Dec 2021 • Kai Yi, Mohamed Elhoseiny
To encourage the private network to capture the domain and task-specific representation, we train our model with a novel adversarial knowledge disentanglement setting to make our global network task-invariant and domain-invariant over all the tasks.
no code implementations • 29 Sep 2021 • Deyao Zhu, Li Erran Li, Mohamed Elhoseiny
Deep reinforcement learning agents trained to perform manipulation tasks in real-world environments with limited diversity of object properties tend to overfit and fail to generalize to unseen testing environments.
no code implementations • 29 Sep 2021 • Kilichbek Haydarov, Aashiq Muhamed, Jovana Lazarevic, Ivan Skorokhodov, Mohamed Elhoseiny
To the best of our knowledge, our work is the first to explore text-controllable continuous image generation.
1 code implementation • 24 Apr 2021 • Jun Chen, Aniket Agarwal, Sherif Abdelkarim, Deyao Zhu, Mohamed Elhoseiny
This paper shows that modeling an effective message-passing flow through an attention mechanism can be critical to tackling the compositionality and long-tail challenges in VRR.
1 code implementation • 20 Apr 2021 • Divyansh Jha, Kai Yi, Ivan Skorokhodov, Mohamed Elhoseiny
By generating representations of unseen classes based on their semantic descriptions, e.g., attributes or text, generative ZSL attempts to differentiate unseen from seen categories.
1 code implementation • ICCV 2021 • Ivan Skorokhodov, Grigorii Sotnikov, Mohamed Elhoseiny
In this work, we develop a method to generate infinite high-resolution images with diverse and complex content.
Ranked #1 on Infinite Image Generation on LHQ
1 code implementation • 20 Feb 2021 • Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
To the best of our knowledge, this is the first work that improves data efficiency of image captioning by utilizing LM pretrained on unimodal data.
2 code implementations • CVPR 2021 • Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas Guibas
We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language.
no code implementations • 1 Jan 2021 • Yuanpeng Li, Liang Zhao, Joel Hestness, Ka Yee Lun, Kenneth Church, Mohamed Elhoseiny
To the best of our knowledge, this is the first work to focus on the transferability of compositionality, and it is orthogonal to existing efforts on learning compositional representations in the training distribution.
no code implementations • 1 Jan 2021 • Yuanpeng Li, Liang Zhao, Joel Hestness, Kenneth Church, Mohamed Elhoseiny
In this paper, we argue that gradient descent is one of the reasons that make compositionality learning hard during neural network optimization.
no code implementations • ICLR 2021 • Ivan Skorokhodov, Mohamed Elhoseiny
Normalization techniques have proved to be a crucial ingredient of successful training in a traditional supervised learning regime.
no code implementations • 1 Jan 2021 • Deyao Zhu, Mohamed Zahran, Li Erran Li, Mohamed Elhoseiny
We propose a new objective, unlikelihood training, which forces generated trajectories that conflict with contextual information to be assigned a lower probability by our model.
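The general shape of an unlikelihood objective is a standard likelihood term on observed data plus a term that suppresses probability mass on undesired outputs; a minimal sketch follows (the function name, inputs, and exact combination are assumptions, not the paper's formulation):

```python
import numpy as np

def unlikelihood_objective(logp_observed, p_conflicting, eps=1e-6):
    """Sketch of an unlikelihood-style training objective.

    logp_observed:  (B,) log-probabilities the model assigns to ground-truth
                    trajectories (to be maximized).
    p_conflicting:  (K,) probabilities the model assigns to trajectories that
                    conflict with contextual information (to be pushed down).
    """
    nll = -np.mean(logp_observed)                        # standard likelihood term
    unlik = -np.mean(np.log(1.0 - p_conflicting + eps))  # unlikelihood penalty
    return nll + unlik
```

The penalty grows as the model places more probability on conflicting trajectories, so minimizing the combined objective assigns them lower probability while still fitting the observed data.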
no code implementations • ICLR 2021 • Deyao Zhu, Mohamed Zahran, Li Erran Li, Mohamed Elhoseiny
Our model's learned representation leads to better and more semantically meaningful coverage of the trajectory distribution.
2 code implementations • 1 Jan 2021 • Mohamed Elhoseiny, Kai Yi, Mohamed Elfeki
To improve the discriminative power of ZSL, we model the visual learning process of unseen categories with inspiration from the psychology of human creativity for producing novel art.
1 code implementation • CVPR 2021 • Ivan Skorokhodov, Savva Ignatyev, Mohamed Elhoseiny
In most existing learning systems, images are typically viewed as 2D pixel arrays.
Ranked #7 on Image Generation on FFHQ 256 x 256
no code implementations • NeurIPS 2020 • Uchenna Akujuobi, Jun Chen, Mohamed Elhoseiny, Michael Spranger, Xiangliang Zhang
Then, the key is to capture the temporal evolution of node pair (term pair) relations from just the positive and unlabeled data.
3 code implementations • 19 Jun 2020 • Ivan Skorokhodov, Mohamed Elhoseiny
Normalization techniques have proved to be a crucial ingredient of successful training in a traditional supervised learning regime.
1 code implementation • 15 Jun 2020 • Abduallah Mohamed, Muhammed Mohaimin Sadiq, Ehab AlBadawy, Mohamed Elhoseiny, Christian Claudel
Also, we show empirically and theoretically that IENs lead to a greater variance reduction in comparison with other similar approaches such as dropout and maxout.
1 code implementation • ICLR 2020 • Yuanpeng Li, Liang Zhao, Kenneth Church, Mohamed Elhoseiny
It also shows significant improvement on a machine translation task.
no code implementations • Conference 2020 • Jun Chen, Robert Hoehndorf, Mohamed Elhoseiny, Xiangliang Zhang
In natural language processing, relation extraction seeks to rationally understand unstructured text.
Ranked #11 on Relation Extraction on TACRED
3 code implementations • ICCV 2021 • Sherif Abdelkarim, Aniket Agarwal, Panos Achlioptas, Jun Chen, Jiaji Huang, Boyang Li, Kenneth Church, Mohamed Elhoseiny
We use these benchmarks to study the performance of several state-of-the-art long-tail models on the LTVRR setup.
1 code implementation • CVPR 2020 • Abduallah Mohamed, Kun Qian, Mohamed Elhoseiny, Christian Claudel
Better machine understanding of pedestrian behaviors enables faster progress in modeling interactions between agents such as autonomous vehicles and humans.
Ranked #3 on Trajectory Prediction on ETH
2 code implementations • ICLR 2020 • Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach
Continual learning aims to learn new tasks without forgetting previously learned ones.
no code implementations • ICLR 2019 • Mohamed Elfeki, Camille Couprie, Mohamed Elhoseiny
Embedded in adversarial training and a variational autoencoder, our Generative DPP approach shows consistent resistance to mode collapse on a wide variety of synthetic data and natural image datasets including MNIST, CIFAR10, and CelebA, while outperforming state-of-the-art methods in data efficiency, convergence time, and generation quality.
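In DPP-based approaches, the determinant of a similarity kernel over a set of samples is the standard way to score diversity: collapsed samples make the kernel nearly singular, so its determinant shrinks. A minimal diversity-loss sketch along those lines (the RBF kernel choice and loss form are my assumptions, not the paper's exact construction):

```python
import numpy as np

def dpp_diversity_loss(features, bandwidth=1.0):
    """Negative log-determinant of an RBF kernel over batch features.
    Mode collapse (near-identical features) drives the determinant toward
    zero, so the loss blows up; diverse features keep it small."""
    sq = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / (2 * bandwidth ** 2))
    K = K + 1e-4 * np.eye(len(features))  # jitter for numerical stability
    sign, logdet = np.linalg.slogdet(K)
    return -logdet
```

Adding such a term to a generator's objective penalizes batches whose features cluster together, which is the intuition behind using DPPs against mode collapse.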
2 code implementations • ICCV 2019 • Mohamed Elhoseiny, Mohamed Elfeki
We relate ZSL to human creativity by observing that zero-shot learning is about recognizing the unseen and creativity is about creating a likable unseen.
1 code implementation • 6 Mar 2019 • Ahmed Ayyad, Yuchen Li, Nassir Navab, Shadi Albarqouni, Mohamed Elhoseiny
We develop a random walk semi-supervised loss that enables the network to learn representations that are compact and well-separated.
4 code implementations • 27 Feb 2019 • Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K. Dokania, Philip H. S. Torr, Marc'Aurelio Ranzato
But for a successful knowledge transfer, the learner needs to remember how to perform previous tasks.
no code implementations • 26 Dec 2018 • Mohamed Elhoseiny, Francesca Babiloni, Rahaf Aljundi, Marcus Rohrbach, Manohar Paluri, Tinne Tuytelaars
So far life-long learning (LLL) has been studied in relatively small-scale and relatively artificial setups.
2 code implementations • ICLR 2019 • Arslan Chaudhry, Marc'Aurelio Ranzato, Marcus Rohrbach, Mohamed Elhoseiny
In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task.
Ranked #6 on Continual Learning on ASC (19 tasks)
4 code implementations • 30 Nov 2018 • Mohamed Elfeki, Camille Couprie, Morgane Riviere, Mohamed Elhoseiny
Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic-looking images.
1 code implementation • 17 Oct 2018 • Mennatullah Siam, Chen Jiang, Steven Lu, Laura Petrich, Mahmoud Gamal, Mohamed Elhoseiny, Martin Jagersand
A human teacher can show potential objects of interest to the robot, which is able to self-adapt to the teaching signal without manual segmentation labels.
Ranked #13 on Unsupervised Video Object Segmentation on DAVIS 2016
no code implementations • 27 Sep 2018 • Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach
Sequential learning of tasks arriving in a continuous stream is a complex problem and becomes more challenging when the model has a fixed capacity.
1 code implementation • ECCV 2018 • Ramprasaath R. Selvaraju, Prithvijit Chattopadhyay, Mohamed Elhoseiny, Tilak Sharma, Dhruv Batra, Devi Parikh, Stefan Lee
Our approach, which we call Neuron Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about novel "unseen" classes onto this dictionary of learned concepts and then optimizes for network parameters that can effectively combine these concepts - essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.
2 code implementations • 27 Apr 2018 • Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny
Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.
1 code implementation • 3 Apr 2018 • Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann Lecun, Camille Couprie
Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant?
no code implementations • 23 Jan 2018 • Ahmed Elgammal, Marian Mazzone, Bingchen Liu, Diana Kim, Mohamed Elhoseiny
How does the machine classify styles in art?
no code implementations • CVPR 2018 • Yizhe Zhu, Mohamed Elhoseiny, Bingchen Liu, Xi Peng, Ahmed Elgammal
Most existing zero-shot learning methods consider the problem as a visual semantic embedding one.
2 code implementations • ECCV 2018 • Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach, Tinne Tuytelaars
We show state-of-the-art performance and, for the first time, the ability to adapt the importance of the parameters based on unlabeled data towards what the network needs (not) to forget, which may vary depending on test conditions.
no code implementations • CVPR 2017 • Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal
We propose a learning framework that is able to connect text terms to their relevant visual parts and suppress connections to non-visual text terms without any part-text annotations.
no code implementations • CVPR 2017 • Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal
We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.
10 code implementations • 21 Jun 2017 • Ahmed Elgammal, Bingchen Liu, Mohamed Elhoseiny, Marian Mazzone
We argue that such networks are limited in their ability to generate creative products in their original design.
no code implementations • 5 Jan 2017 • Mohamed Elhoseiny, Ahmed Elgammal
We present the Overlapping Domain Cover (ODC) notion for kernel machines: a set of overlapping subsets of the data that covers the entire training set and is optimized to be as spatially cohesive as possible.
no code implementations • CVPR 2016 • Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, Dimitris Metaxas
In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification.
no code implementations • WS 2016 • Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal
Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions.
no code implementations • 31 Dec 2015 • Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the parameters of a linear classifier.
no code implementations • 2 Dec 2015 • Mohamed Elhoseiny, Jingen Liu, Hui Cheng, Harpreet Sawhney, Ahmed Elgammal
To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional semantics and extends it in the following directions: (a) semantic embedding of multimodal information in videos (with focus on the visual modalities), (b) automatically determining relevance of concepts/attributes to a free text query, which could be useful for other applications, and (c) retrieving videos by free text event query (e.g., "changing a vehicle tire") based on their content.
no code implementations • 16 Nov 2015 • Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal
In the task of Object Recognition, there exists a dichotomy between categorizing objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects.
no code implementations • 16 Nov 2015 • Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal
We show that learning visual facts in a structured way enables not only a uniform but also generalizable visual understanding.
no code implementations • 9 Aug 2015 • Amr Bakry, Mohamed Elhoseiny, Tarek El-Gaaly, Ahmed Elgammal
How does fine-tuning of a pre-trained CNN on a multi-view dataset affect the representation at each layer of the network?
no code implementations • 29 Jun 2015 • Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
In this paper we propose a framework for predicting kernelized classifiers in the visual domain for categories with no training images where the knowledge comes from textual description about these categories.
no code implementations • CVPR 2015 • Sheng Huang, Mohamed Elhoseiny, Ahmed Elgammal, Dan Yang
Then the attribute prediction problem is cast as a regularized hypergraph cut problem in which HAP jointly learns a collection of attribute projections from the feature space to a hypergraph embedding space aligned with the attribute space.
no code implementations • 26 Sep 2014 • Mohamed Elhoseiny, Ahmed Elgammal
In this paper, we present a generalized structured regression framework based on Sharma-Mittal divergence, a relative entropy measure, which is introduced to the Machine Learning community in this work.
no code implementations • 1 Aug 2014 • Mohamed Elhoseiny, Ahmed Elgammal
This work first introduces the MindMap Multilevel Visualization concept, which jointly visualizes and summarizes textual information.