1 code implementation • 8 Apr 2024 • Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang
This approach effectively synergizes reference image and text prompt information to produce valuable image features, facilitating an image diffusion model.
1 code implementation • 4 Feb 2024 • Faizan Farooq Khan, Diana Kim, Divyansh Jha, Youssef Mohamed, Hanna H Chang, Ahmed Elgammal, Luba Elliott, Mohamed Elhoseiny
Our comparative analysis is based on an extensive dataset, dubbed "ArtConstellation," consisting of annotations about art principles, likability, and emotions for 6,000 WikiArt and 3,200 AI-generated artworks.
no code implementations • 8 Dec 2022 • Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain?
no code implementations • 5 Jan 2022 • Diana Kim, Ahmed Elgammal, Marian Mazzone
In this paper, we propose a novel proxy model and reformulate four pre-existing methods in the context of proxy learning.
no code implementations • 29 Sep 2021 • Bingchen Liu, Yizhe Zhu, Xiao Yang, Ahmed Elgammal
The VQSN module facilitates a more delicate separation of posture and identity, while the training scheme ensures the VQSN module learns the pose-related representations.
7 code implementations • ICLR 2021 • Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
Training Generative Adversarial Networks (GAN) on high-fidelity images usually requires large-scale GPU-clusters and a vast number of training images.
Ranked #2 on Image Generation on ADE-Indoor
1 code implementation • 16 Dec 2020 • Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
Moreover, with the proposed sketch generator, the model shows a promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful.
no code implementations • 4 Oct 2020 • Mahyar Khayatkhoei, Ahmed Elgammal
As the success of Generative Adversarial Networks (GANs) on natural images quickly propels them into various real-life applications across different domains, it becomes more and more important to clearly understand their limitations.
no code implementations • 27 May 2020 • Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal
Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework.
1 code implementation • LREC 2020 • Mohamed Abdellatif, Ahmed Elgammal
In this paper, we approach the shared task OffenseEval 2020 (Mubarak et al., 2020) using ULMFiT (Howard and Ruder, 2018) pre-trained on Arabic Wikipedia (Khooli, 2019), which we use as a starting point and fine-tune on the target dataset.
no code implementations • LREC 2020 • Mohamed Abdellatif, Ahmed Elgammal
Authors: Mohamed Abdellatif and Ahmed Elgammal; Gitlab URL: https://gitlab.com/abdollatif/lrec_app; Commit hash: 3f20b2ddb96d8c865e5f56f5566edf371214785f; Tag name: Splits2; Dataset file md5: 5aee3dac5e48d1ac3d279083212734c9; Dataset URL: https://drive.google.com/file/d/1cv5HuQhgFVizupFI40dzreemS2gMM498/view?usp=sharing
1 code implementation • 26 Feb 2020 • Bingchen Liu, Kunpeng Song, Ahmed Elgammal
We propose a new approach for synthesizing fully detailed art-stylized images from sketches.
no code implementations • 16 Jul 2019 • Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas
We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering.
1 code implementation • 26 May 2019 • Bingchen Liu, Yizhe Zhu, Zuohui Fu, Gerard de Melo, Ahmed Elgammal
Exploring the potential of GANs for unsupervised disentanglement learning, this paper proposes a novel GAN-based disentanglement framework with One-Hot Sampling and Orthogonal Regularization (OOGAN).
1 code implementation • ICCV 2019 • Yizhe Zhu, Jianwen Xie, Bingchen Liu, Ahmed Elgammal
We investigate learning feature-to-feature translator networks by alternating back-propagation as a general-purpose solution to zero-shot learning (ZSL) problems.
3 code implementations • CVPR 2019 • Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro
The first, Entity Instance Confusion, occurs when the model confuses multiple instances of the same type of entity (e.g., multiple cups).
no code implementations • NeurIPS 2019 • Yizhe Zhu, Jianwen Xie, Zhiqiang Tang, Xi Peng, Ahmed Elgammal
Zero-shot learning extends conventional object classification to unseen class recognition by introducing semantic representations of classes.
no code implementations • 21 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal
We propose an efficient and interpretable scene graph generator.
no code implementations • 1 Nov 2018 • Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal
This article describes the model we built that achieved 1st place in the OpenImage Visual Relationship Detection Challenge on Kaggle.
1 code implementation • NeurIPS 2018 • Mahyar Khayatkhoei, Ahmed Elgammal, Maneesh Singh
Natural images may lie on a union of disjoint manifolds rather than one globally connected manifold, and this can cause several difficulties for the training of common Generative Adversarial Networks (GANs).
2 code implementations • 27 Apr 2018 • Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny
Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.
no code implementations • 23 Jan 2018 • Ahmed Elgammal, Marian Mazzone, Bingchen Liu, Diana Kim, Mohamed Elhoseiny
How does the machine classify styles in art?
no code implementations • CVPR 2018 • Yizhe Zhu, Mohamed Elhoseiny, Bingchen Liu, Xi Peng, Ahmed Elgammal
Most existing zero-shot learning methods consider the problem as a visual semantic embedding one.
no code implementations • 8 Nov 2017 • Ahmed Elgammal, Yan Kang, Milko Den Leeuw
We also propose and compare different classification methods at the drawing level.
no code implementations • CVPR 2017 • Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal
We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations.
no code implementations • ICCV 2017 • Yizhe Zhu, Ahmed Elgammal
The exponentially increasing use of moving platforms for video capture creates an urgent need for general background subtraction algorithms that can deal with a moving background.
no code implementations • CVPR 2017 • Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal
We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.
10 code implementations • 21 Jun 2017 • Ahmed Elgammal, Bingchen Liu, Mohamed Elhoseiny, Marian Mazzone
We argue that such networks are limited in their ability to generate creative products in their original design.
no code implementations • 5 Jan 2017 • Mohamed Elhoseiny, Ahmed Elgammal
We present the Overlapping Domain Cover (ODC) notion for kernel machines, as a set of overlapping subsets of the data that covers the entire training set and is optimized to be as spatially cohesive as possible.
no code implementations • 29 Sep 2016 • Gabriel Moyà-Alcover, Ahmed Elgammal, Antoni Jaume-i-Capó, Javier Varona
In order to unify all the device channel cues, a new probabilistic depth data model is also proposed, where we show how to handle the inaccurate data to improve foreground segmentation.
no code implementations • CVPR 2016 • Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, Dimitris Metaxas
In this paper, we propose a new CNN architecture that integrates semantic part detection and abstraction (SPDA-CNN) for fine-grained classification.
no code implementations • WS 2016 • Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal
Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions.
no code implementations • 9 Feb 2016 • Babak Saleh, Ahmed Elgammal, Jacob Feldman
We focus on Convolutional Neural Networks (CNN) as the state-of-the-art models in object recognition and classification, investigate this problem in more detail, and hypothesize that training CNN models suffers from unstructured loss minimization.
no code implementations • 22 Jan 2016 • Amr Bakry, Ahmed Elgammal
Embedding the visual units on a manifold and using manifold kernels is one way to measure these distances.
no code implementations • 7 Jan 2016 • Chetan Tonde, Ahmed Elgammal
In this work we focus on learning kernel representations for structured regression.
no code implementations • 3 Jan 2016 • Praneeth Vepakomma, Chetan Tonde, Ahmed Elgammal
In our work, we propose a novel formulation for supervised dimensionality reduction based on a nonlinear dependency criterion called Statistical Distance Correlation, Szekely et.
no code implementations • 31 Dec 2015 • Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the parameters of a linear classifier.
no code implementations • 4 Dec 2015 • Babak Saleh, Ahmed Elgammal, Jacob Feldman, Ali Farhadi
In this paper we study various types of atypicalities in images in a more comprehensive way than has been done before.
no code implementations • 2 Dec 2015 • Mohamed Elhoseiny, Jingen Liu, Hui Cheng, Harpreet Sawhney, Ahmed Elgammal
To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional semantics and extends it in the following directions: (a) semantic embedding of multimodal information in videos (with focus on the visual modalities), (b) automatically determining relevance of concepts/attributes to a free text query, which could be useful for other applications, and (c) retrieving videos by free text event query (e.g., "changing a vehicle tire") based on their content.
no code implementations • 16 Nov 2015 • Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal
In the task of Object Recognition, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects.
no code implementations • 16 Nov 2015 • Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal
We show that learning visual facts in a structured way enables not only a uniform but also a generalizable visual understanding.
no code implementations • 9 Aug 2015 • Amr Bakry, Mohamed Elhoseiny, Tarek El-Gaaly, Ahmed Elgammal
How does fine-tuning of a pre-trained CNN on a multi-view dataset affect the representation at each layer of the network?
no code implementations • 29 Jun 2015 • Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
In this paper we propose a framework for predicting kernelized classifiers in the visual domain for categories with no training images where the knowledge comes from textual description about these categories.
no code implementations • 2 Jun 2015 • Ahmed Elgammal, Babak Saleh
The proposed computational framework is based on constructing a network between creative products and using this network to infer the originality and influence of its nodes.
1 code implementation • 5 May 2015 • Babak Saleh, Ahmed Elgammal
In the past few years, the number of fine-art collections that are digitized and publicly available has been growing rapidly.
no code implementations • 23 Mar 2015 • Haopeng Zhang, Tarek El-Gaaly, Ahmed Elgammal, Zhiguo Jiang
Due to large variations in shape, appearance, and viewing conditions, object recognition is a key precursory challenge in the fields of object manipulation and robotic/AI visual reasoning in general.
no code implementations • CVPR 2015 • Sheng Huang, Mohamed Elhoseiny, Ahmed Elgammal, Dan Yang
Then the attribute prediction problem is cast as a regularized hypergraph cut problem in which HAP jointly learns a collection of attribute projections from the feature space to a hypergraph embedding space aligned with the attribute space.
no code implementations • 9 Nov 2014 • Babak Saleh, Ali Farhadi, Ahmed Elgammal
When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.
no code implementations • 24 Oct 2014 • Sheng Huang, Ahmed Elgammal, Dan Yang
However, many studies on pairwise graphs show that the choice of edge weight can significantly influence the performance of such graph algorithms.
no code implementations • 30 Sep 2014 • Emily L. Spratt, Ahmed Elgammal
Through an investigation of the ethical consequences of this innovative technology, the unquestioned authority of the art expert is challenged and the subjective nature of aesthetic judgment is brought to philosophical scrutiny once again.
no code implementations • 26 Sep 2014 • Mohamed Elhoseiny, Ahmed Elgammal
In this paper, we present a generalized structured regression framework based on Sharma-Mittal divergence, a relative entropy measure, which is introduced to the Machine Learning community in this work.
no code implementations • 14 Aug 2014 • Babak Saleh, Kanako Abe, Ravneet Singh Arora, Ahmed Elgammal
The contribution of this paper is in exploring the problem of computer-automated suggestion of influences between artists, a problem that was not addressed before in a general setting.
no code implementations • 1 Aug 2014 • Mohamed Elhoseiny, Ahmed Elgammal
This work first introduces the MindMap Multilevel Visualization concept, which jointly visualizes and summarizes textual information.
no code implementations • CVPR 2014 • Chetan Tonde, Ahmed Elgammal
We formulate this problem for learning covariance kernels of Twin Gaussian Processes.
no code implementations • 28 Dec 2013 • Sheng Huang, Dan Yang, Dong Yang, Ahmed Elgammal
In our algorithm, the discriminating power of DLPP is further exploited from two aspects.
no code implementations • 16 Sep 2013 • Ishani Chakraborty, Ahmed Elgammal
We consider the problem of naming objects in complex, natural scenes containing widely varying object appearance and subtly different names.
no code implementations • 11 Jun 2013 • Praneeth Vepakomma, Ahmed Elgammal
Our setting is different from subset-selection algorithms where the problem is to choose the best subset of features for regression.
no code implementations • CVPR 2013 • Babak Saleh, Ali Farhadi, Ahmed Elgammal
When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.
no code implementations • CVPR 2013 • Amr Bakry, Ahmed Elgammal
Our approach outperforms the baseline by at least 15% in the speaker semi-dependent setting and is competitive in the other two settings.