Search Results for author: Mohamed Elhoseiny

Found 60 papers, 28 papers with code

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes

1 code implementation ECCV 2020 Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas

Due to the scarcity and unsuitability of existent 3D-oriented linguistic resources for this task, we first develop two large-scale and complementary visio-linguistic datasets: i) Sr3D, which contains 83.5K template-based utterances leveraging spatial relations with other fine-grained object classes to localize a referred object in a given scene, and ii) Nr3D, which contains 41.5K natural, free-form utterances collected by deploying a 2-player object reference game in 3D scenes.

It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection

no code implementations 15 Apr 2022 Youssef Mohamed, Faizan Farooq Khan, Kilichbek Haydarov, Mohamed Elhoseiny

As a step in this direction, the ArtEmis dataset was recently introduced as a large-scale dataset of emotional reactions to images along with language explanations of these chosen emotions.

Image Captioning

Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification

no code implementations 2 Mar 2022 Kai Yi, Xiaoqian Shen, Yunhao Gou, Mohamed Elhoseiny

The main question we address in this paper is how to scale up visual recognition of unseen classes, also known as zero-shot learning, to tens of thousands of categories as in the ImageNet-21K benchmark.

Image Classification Zero-Shot Image Classification +1

Efficiently Disentangle Causal Representations

1 code implementation 6 Jan 2022 Yuanpeng Li, Joel Hestness, Mohamed Elhoseiny, Liang Zhao, Kenneth Church

This paper proposes an efficient approach to learning disentangled representations with causal mechanisms based on the difference of conditional probabilities in original and new distributions.

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

1 code implementation 29 Dec 2021 Ivan Skorokhodov, Sergey Tulyakov, Mohamed Elhoseiny

We build our model on top of StyleGAN2, and it is just approximately 5% more expensive to train at the same resolution while achieving almost the same image quality.

Domain-Aware Continual Zero-Shot Learning

no code implementations 24 Dec 2021 Kai Yi, Mohamed Elhoseiny

To encourage the private network to capture the domain and task-specific representation, we train our model with a novel adversarial knowledge disentanglement setting to make our global network task-invariant and domain-invariant over all the tasks.

Disentanglement Zero-Shot Learning

CausalDyna: Improving Generalization of Dyna-style Reinforcement Learning via Counterfactual-Based Data Augmentation

no code implementations 29 Sep 2021 Deyao Zhu, Li Erran Li, Mohamed Elhoseiny

Deep reinforcement learning agents trained in real-world environments with a limited diversity of object properties to learn manipulation tasks tend to suffer overfitting and fail to generalize to unseen testing environments.

Data Augmentation reinforcement-learning

RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition

1 code implementation 24 Apr 2021 Jun Chen, Aniket Agarwal, Sherif Abdelkarim, Deyao Zhu, Mohamed Elhoseiny

This paper shows that modeling an effective message-passing flow through an attention mechanism can be critical to tackling the compositionality and long-tail challenges in VRR.

Image Captioning Object Recognition +5

Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation

1 code implementation 20 Apr 2021 Divyansh Jha, Kai Yi, Ivan Skorokhodov, Mohamed Elhoseiny

By generating representations of unseen classes based on their semantic descriptions, e.g., attributes or text, generative ZSL attempts to differentiate unseen from seen categories.

Image Generation Zero-Shot Learning

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

1 code implementation 20 Feb 2021 Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny

To the best of our knowledge, this is the first work that improves data efficiency of image captioning by utilizing LM pretrained on unimodal data.

Image Captioning Language Modelling +2

ArtEmis: Affective Language for Visual Art

2 code implementations CVPR 2021 Panos Achlioptas, Maks Ovsjanikov, Kilichbek Haydarov, Mohamed Elhoseiny, Leonidas Guibas

We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language.

Transferability of Compositionality

no code implementations 1 Jan 2021 Yuanpeng Li, Liang Zhao, Joel Hestness, Ka Yee Lun, Kenneth Church, Mohamed Elhoseiny

To our best knowledge, this is the first work to focus on the transferability of compositionality, and it is orthogonal to existing efforts of learning compositional representations in training distribution.

Out-of-Distribution Generalization

Gradient Descent Resists Compositionality

no code implementations 1 Jan 2021 Yuanpeng Li, Liang Zhao, Joel Hestness, Kenneth Church, Mohamed Elhoseiny

In this paper, we argue that gradient descent is one of the reasons that make compositionality learning hard during neural network optimization.

Class Normalization for Zero-Shot Learning

no code implementations ICLR 2021 Ivan Skorokhodov, Mohamed Elhoseiny

Normalization techniques have proved to be a crucial ingredient of successful training in a traditional supervised learning regime.

Zero-Shot Learning

Motion Forecasting with Unlikelihood Training

no code implementations 1 Jan 2021 Deyao Zhu, Mohamed Zahran, Li Erran Li, Mohamed Elhoseiny

We propose a new objective, unlikelihood training, which forces generated trajectories that conflict with contextual information to be assigned a lower probability by our model.

Motion Forecasting Trajectory Forecasting

CIZSL++: Creativity Inspired Generative Zero-Shot Learning

2 code implementations 1 Jan 2021 Mohamed Elhoseiny, Kai Yi, Mohamed Elfeki

To improve the discriminative power of ZSL, we model the visual learning process of unseen categories with inspiration from the psychology of human creativity for producing novel art.

Transfer Learning Zero-Shot Learning

Class Normalization for (Continual)? Generalized Zero-Shot Learning

3 code implementations 19 Jun 2020 Ivan Skorokhodov, Mohamed Elhoseiny

Normalization techniques have proved to be a crucial ingredient of successful training in a traditional supervised learning regime.

Generalized Zero-Shot Learning

Inner Ensemble Networks: Average Ensemble as an Effective Regularizer

1 code implementation 15 Jun 2020 Abduallah Mohamed, Muhammed Mohaimin Sadiq, Ehab AlBadawy, Mohamed Elhoseiny, Christian Claudel

Also, we show empirically and theoretically that IENs lead to a greater variance reduction in comparison with other similar approaches such as dropout and maxout.

Neural Architecture Search

Learning Diverse Generations using Determinantal Point Processes

no code implementations ICLR 2019 Mohamed Elfeki, Camille Couprie, Mohamed Elhoseiny

Embedded in an adversarial training and variational autoencoder, our Generative DPP approach shows a consistent resistance to mode-collapse on a wide-variety of synthetic data and natural image datasets including MNIST, CIFAR10, and CelebA, while outperforming state-of-the-art methods for data-efficiency, convergence-time, and generation quality.

Point Processes

Creativity Inspired Zero-Shot Learning

2 code implementations ICCV 2019 Mohamed Elhoseiny, Mohamed Elfeki

We relate ZSL to human creativity by observing that zero-shot learning is about recognizing the unseen and creativity is about creating a likable unseen.

Transfer Learning Zero-Shot Learning

Semi-Supervised Few-Shot Learning with Prototypical Random Walks

1 code implementation 6 Mar 2019 Ahmed Ayyad, Yuchen Li, Nassir Navab, Shadi Albarqouni, Mohamed Elhoseiny

We develop a random walk semi-supervised loss that enables the network to learn representations that are compact and well-separated.

Few-Shot Learning

Exploring the Challenges towards Lifelong Fact Learning

no code implementations 26 Dec 2018 Mohamed Elhoseiny, Francesca Babiloni, Rahaf Aljundi, Marcus Rohrbach, Manohar Paluri, Tinne Tuytelaars

So far life-long learning (LLL) has been studied in relatively small-scale and relatively artificial setups.

Efficient Lifelong Learning with A-GEM

2 code implementations ICLR 2019 Arslan Chaudhry, Marc'Aurelio Ranzato, Marcus Rohrbach, Mohamed Elhoseiny

In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task.

Continual Learning

GDPP: Learning Diverse Generations Using Determinantal Point Process

4 code implementations 30 Nov 2018 Mohamed Elfeki, Camille Couprie, Morgane Riviere, Mohamed Elhoseiny

Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic-looking images.

Uncertainty-guided Lifelong Learning in Bayesian Networks

no code implementations 27 Sep 2018 Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach

Sequential learning of tasks arriving in a continuous stream is a complex problem and becomes more challenging when the model has a fixed capacity.

Continual Learning

Choose Your Neuron: Incorporating Domain Knowledge through Neuron-Importance

1 code implementation ECCV 2018 Ramprasaath R. Selvaraju, Prithvijit Chattopadhyay, Mohamed Elhoseiny, Tilak Sharma, Dhruv Batra, Devi Parikh, Stefan Lee

Our approach, which we call Neuron Importance-Aware Weight Transfer (NIWT), learns to map domain knowledge about novel "unseen" classes onto this dictionary of learned concepts and then optimizes for network parameters that can effectively combine these concepts - essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.

Generalized Zero-Shot Learning

Large-Scale Visual Relationship Understanding

2 code implementations 27 Apr 2018 Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

DeSIGN: Design Inspiration from Generative Networks

1 code implementation 3 Apr 2018 Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann Lecun, Camille Couprie

Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant?

Image Generation

Memory Aware Synapses: Learning what (not) to forget

2 code implementations ECCV 2018 Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach, Tinne Tuytelaars

We show state-of-the-art performance and, for the first time, the ability to adapt the importance of the parameters based on unlabeled data towards what the network needs (not) to forget, which may vary depending on test conditions.

Object Recognition

Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision

no code implementations CVPR 2017 Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Zero-Shot Learning

Relationship Proposal Networks

no code implementations CVPR 2017 Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal

We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.

Scene Understanding

CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms

10 code implementations 21 Jun 2017 Ahmed Elgammal, Bingchen Liu, Mohamed Elhoseiny, Marian Mazzone

We argue that such networks are limited in their ability to generate creative products in their original design.

Overlapping Cover Local Regression Machines

no code implementations 5 Jan 2017 Mohamed Elhoseiny, Ahmed Elgammal

We present the Overlapping Domain Cover (ODC) notion for kernel machines, as a set of overlapping subsets of the data that covers the entire training set and is optimized to be as spatially cohesive as possible.

GPR Pose Estimation

Automatic Annotation of Structured Facts in Images

no code implementations WS 2016 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions.

Write a Classifier: Predicting Visual Classifiers from Unstructured Text

no code implementations 31 Dec 2015 Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh

Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the parameters of a linear classifier.

Transfer Learning

Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos

no code implementations 2 Dec 2015 Mohamed Elhoseiny, Jingen Liu, Hui Cheng, Harpreet Sawhney, Ahmed Elgammal

To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional semantics and extends it in the following directions: (a) semantic embedding of multimodal information in videos (with focus on the visual modalities), (b) automatically determining relevance of concepts/attributes to a free text query, which could be useful for other applications, and (c) retrieving videos by free text event query (e.g., "changing a vehicle tire") based on their content.

Event Detection

Convolutional Models for Joint Object Categorization and Pose Estimation

no code implementations 16 Nov 2015 Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal

In the task of Object Recognition, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects.

Object Categorization Object Recognition +1

Sherlock: Scalable Fact Learning in Images

no code implementations 16 Nov 2015 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

We show that learning visual facts in a structured way enables not only a uniform but also generalizable visual understanding.

Multiview Learning

Digging Deep into the layers of CNNs: In Search of How CNNs Achieve View Invariance

no code implementations 9 Aug 2015 Amr Bakry, Mohamed Elhoseiny, Tarek El-Gaaly, Ahmed Elgammal

How does fine-tuning of a pre-trained CNN on a multi-view dataset affect the representation at each layer of the network?

Tell and Predict: Kernel Classifier Prediction for Unseen Visual Classes from Unstructured Text Descriptions

no code implementations 29 Jun 2015 Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh

In this paper we propose a framework for predicting kernelized classifiers in the visual domain for categories with no training images where the knowledge comes from textual description about these categories.

Zero-Shot Learning

Learning Hypergraph-regularized Attribute Predictors

no code implementations CVPR 2015 Sheng Huang, Mohamed Elhoseiny, Ahmed Elgammal, Dan Yang

Then the attribute prediction problem is cast as a regularized hypergraph cut problem in which HAP jointly learns a collection of attribute projections from the feature space to a hypergraph embedding space aligned with the attribute space.

hypergraph embedding

Generalized Twin Gaussian Processes using Sharma-Mittal Divergence

no code implementations 26 Sep 2014 Mohamed Elhoseiny, Ahmed Elgammal

In this paper, we present a generalized structured regression framework based on Sharma-Mittal divergence, a relative entropy measure, which is introduced to the Machine Learning community in this work.

Gaussian Processes

Text to Multi-level MindMaps: A Novel Method for Hierarchical Visual Abstraction of Natural Language Text

no code implementations 1 Aug 2014 Mohamed Elhoseiny, Ahmed Elgammal

This work first introduces the MindMap Multilevel Visualization concept, which jointly visualizes and summarizes textual information.
