Search Results for author: Ahmed Elgammal

Found 60 papers, 13 papers with code

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

1 code implementation 8 Apr 2024 Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang

This approach effectively synergizes reference image and text prompt information to produce valuable image features, facilitating an image diffusion model.

Image-to-Image Translation Language Modelling +3

AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art

1 code implementation 4 Feb 2024 Faizan Farooq Khan, Diana Kim, Divyansh Jha, Youssef Mohamed, Hanna H Chang, Ahmed Elgammal, Luba Elliott, Mohamed Elhoseiny

Our comparative analysis is based on an extensive dataset, dubbed "ArtConstellation," consisting of annotations about art principles, likability, and emotions for 6,000 WikiArt and 3,200 AI-generated artworks.

Diffusion Guided Domain Adaptation of Image Generators

no code implementations 8 Dec 2022 Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal

Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain?

Domain Adaptation

Formal Analysis of Art: Proxy Learning of Visual Concepts from Style Through Language Models

no code implementations 5 Jan 2022 Diana Kim, Ahmed Elgammal, Marian Mazzone

In this paper, we propose a novel proxy model and reformulate four pre-existing methods in the context of proxy learning.

Language Modelling

PIVQGAN: Posture and Identity Disentangled Image-to-Image Translation via Vector Quantization

no code implementations 29 Sep 2021 Bingchen Liu, Yizhe Zhu, Xiao Yang, Ahmed Elgammal

The VQSN module facilitates a more delicate separation of posture and identity, while the training scheme ensures the VQSN module learns the pose-related representations.

Disentanglement Image-to-Image Translation +2

Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

7 code implementations ICLR 2021 Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal

Training Generative Adversarial Networks (GAN) on high-fidelity images usually requires large-scale GPU-clusters and a vast number of training images.

Image Generation

Self-Supervised Sketch-to-Image Synthesis

1 code implementation 16 Dec 2020 Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal

Moreover, with the proposed sketch generator, the model shows a promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful.

Image Generation Self-Supervised Learning +1

Spatial Frequency Bias in Convolutional Generative Adversarial Networks

no code implementations 4 Oct 2020 Mahyar Khayatkhoei, Ahmed Elgammal

As the success of Generative Adversarial Networks (GANs) on natural images quickly propels them into various real-life applications across different domains, it becomes more and more important to clearly understand their limitations.

Denoising Super-Resolution

TIME: Text and Image Mutual-Translation Adversarial Networks

no code implementations 27 May 2020 Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal

Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework.

Generative Adversarial Network Image Captioning +3

Offensive language detection in Arabic using ULMFiT

1 code implementation LREC 2020 Mohamed Abdellatif, Ahmed Elgammal

In this paper, we approach the shared task OffenseEval 2020 by Mubarak et al. (2020) using ULMFiT (Howard and Ruder, 2018) pre-trained on Arabic Wikipedia (Khooli, 2019) as a starting point, and we fine-tune it on the target dataset.

General Classification Language Modelling +3

ULMFiT replication

no code implementations LREC 2020 Mohamed Abdellatif, Ahmed Elgammal

Authors: Mohamed Abdellatif and Ahmed Elgammal
Gitlab URL: https://gitlab.com/abdollatif/lrec_app
Commit hash: 3f20b2ddb96d8c865e5f56f5566edf371214785f
Tag name: Splits2
Dataset file md5: 5aee3dac5e48d1ac3d279083212734c9
Dataset URL: https://drive.google.com/file/d/1cv5HuQhgFVizupFI40dzreemS2gMM498/view?usp=sharing

TAG

Sketch-to-Art: Synthesizing Stylized Art Images From Sketches

1 code implementation 26 Feb 2020 Bingchen Liu, Kunpeng Song, Ahmed Elgammal

We propose a new approach for synthesizing fully detailed art-stylized images from sketches.

2nd Place Solution to the GQA Challenge 2019

no code implementations 16 Jul 2019 Shijie Geng, Ji Zhang, Hang Zhang, Ahmed Elgammal, Dimitris N. Metaxas

We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering.

Question Answering Visual Question Answering +1

OOGAN: Disentangling GAN with One-Hot Sampling and Orthogonal Regularization

1 code implementation 26 May 2019 Bingchen Liu, Yizhe Zhu, Zuohui Fu, Gerard de Melo, Ahmed Elgammal

Exploring the potential of GANs for unsupervised disentanglement learning, this paper proposes a novel GAN-based disentanglement framework with One-Hot Sampling and Orthogonal Regularization (OOGAN).

Disentanglement

Learning Feature-to-Feature Translator by Alternating Back-Propagation for Generative Zero-Shot Learning

1 code implementation ICCV 2019 Yizhe Zhu, Jianwen Xie, Bingchen Liu, Ahmed Elgammal

We investigate learning feature-to-feature translator networks by alternating back-propagation as a general-purpose solution to zero-shot learning (ZSL) problems.

Zero-Shot Learning

Graphical Contrastive Losses for Scene Graph Parsing

3 code implementations CVPR 2019 Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro

The first, Entity Instance Confusion, occurs when the model confuses multiple instances of the same type of entity (e.g., multiple cups).

Relationship Detection Scene Graph Generation +1

Semantic-Guided Multi-Attention Localization for Zero-Shot Learning

no code implementations NeurIPS 2019 Yizhe Zhu, Jianwen Xie, Zhiqiang Tang, Xi Peng, Ahmed Elgammal

Zero-shot learning extends the conventional object classification to the unseen class recognition by introducing semantic representations of classes.

Triplet Zero-Shot Learning

Introduction to the 1st Place Winning Model of OpenImages Relationship Detection Challenge

no code implementations 1 Nov 2018 Ji Zhang, Kevin Shih, Andrew Tao, Bryan Catanzaro, Ahmed Elgammal

This article describes the model we built that achieved 1st place in the OpenImage Visual Relationship Detection Challenge on Kaggle.

Relationship Detection Visual Relationship Detection

Disconnected Manifold Learning for Generative Adversarial Networks

1 code implementation NeurIPS 2018 Mahyar Khayatkhoei, Ahmed Elgammal, Maneesh Singh

Natural images may lie on a union of disjoint manifolds rather than one globally connected manifold, and this can cause several difficulties for the training of common Generative Adversarial Networks (GANs).

Large-Scale Visual Relationship Understanding

2 code implementations 27 Apr 2018 Ji Zhang, Yannis Kalantidis, Marcus Rohrbach, Manohar Paluri, Ahmed Elgammal, Mohamed Elhoseiny

Large scale visual understanding is challenging, as it requires a model to handle the widely-spread and imbalanced distribution of <subject, relation, object> triples.

Relationship Detection

Link the head to the "beak": Zero Shot Learning from Noisy Text Description at Part Precision

no code implementations CVPR 2017 Mohamed Elhoseiny, Yizhe Zhu, Han Zhang, Ahmed Elgammal

We propose a learning framework that is able to connect text terms to their relevant parts and suppress connections to non-visual text terms without any part-text annotations.

Zero-Shot Learning

A Multilayer-Based Framework for Online Background Subtraction with Freely Moving Cameras

no code implementations ICCV 2017 Yizhe Zhu, Ahmed Elgammal

The exponentially increasing use of moving platforms for video capture introduces an urgent need to develop general background subtraction algorithms capable of dealing with a moving background.

Segmentation

Relationship Proposal Networks

no code implementations CVPR 2017 Ji Zhang, Mohamed Elhoseiny, Scott Cohen, Walter Chang, Ahmed Elgammal

We demonstrate the ability of our Rel-PN to localize relationships with only a few thousand proposals.

Scene Understanding

CAN: Creative Adversarial Networks, Generating "Art" by Learning About Styles and Deviating from Style Norms

10 code implementations 21 Jun 2017 Ahmed Elgammal, Bingchen Liu, Mohamed Elhoseiny, Marian Mazzone

We argue that such networks are limited in their ability to generate creative products in their original design.

Overlapping Cover Local Regression Machines

no code implementations 5 Jan 2017 Mohamed Elhoseiny, Ahmed Elgammal

We present the Overlapping Domain Cover (ODC) notion for kernel machines, as a set of overlapping subsets of the data that covers the entire training set and is optimized to be as spatially cohesive as possible.

GPR Pose Estimation +1

Modelling depth for nonparametric foreground segmentation using RGBD devices

no code implementations 29 Sep 2016 Gabriel Moyà-Alcover, Ahmed Elgammal, Antoni Jaume-i-Capó, Javier Varona

In order to unify all the device channel cues, a new probabilistic depth data model is also proposed, where we show how to handle the inaccurate data to improve foreground segmentation.

Foreground Segmentation Segmentation

Automatic Annotation of Structured Facts in Images

no code implementations WS 2016 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

Motivated by the application of fact-level image understanding, we present an automatic method for data collection of structured visual facts from images with captions.

The Role of Typicality in Object Classification: Improving The Generalization Capacity of Convolutional Neural Networks

no code implementations 9 Feb 2016 Babak Saleh, Ahmed Elgammal, Jacob Feldman

We focus on Convolutional Neural Networks (CNN) as the state-of-the-art models in object recognition and classification, investigate this problem in more detail, and hypothesize that the training of CNN models suffers from unstructured loss minimization.

General Classification Object Recognition

Manifold-Kernels Comparison in MKPLS for Visual Speech Recognition

no code implementations 22 Jan 2016 Amr Bakry, Ahmed Elgammal

Embedding the visual units on a manifold and using manifold kernels is one way to measure these distances.

speech-recognition Visual Speech Recognition

Supervised Dimensionality Reduction via Distance Correlation Maximization

no code implementations 3 Jan 2016 Praneeth Vepakomma, Chetan Tonde, Ahmed Elgammal

In our work, we propose a novel formulation for supervised dimensionality reduction based on a nonlinear dependency criterion called Statistical Distance Correlation, Szekely et.

regression Supervised dimensionality reduction

Write a Classifier: Predicting Visual Classifiers from Unstructured Text

no code implementations 31 Dec 2015 Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh

Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the parameters of a linear classifier.

regression Transfer Learning

Toward a Taxonomy and Computational Models of Abnormalities in Images

no code implementations 4 Dec 2015 Babak Saleh, Ahmed Elgammal, Jacob Feldman, Ali Farhadi

In this paper we study various types of atypicalities in images in a more comprehensive way than has been done before.

Zero-Shot Event Detection by Multimodal Distributional Semantic Embedding of Videos

no code implementations 2 Dec 2015 Mohamed Elhoseiny, Jingen Liu, Hui Cheng, Harpreet Sawhney, Ahmed Elgammal

To our knowledge, this is the first Zero-Shot event detection model that is built on top of distributional semantics and extends it in the following directions: (a) semantic embedding of multimodal information in videos (with focus on the visual modalities), (b) automatically determining relevance of concepts/attributes to a free text query, which could be useful for other applications, and (c) retrieving videos by free text event query (e.g., "changing a vehicle tire") based on their content.

Event Detection

Convolutional Models for Joint Object Categorization and Pose Estimation

no code implementations 16 Nov 2015 Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal

In the task of Object Recognition, there exists a dichotomy between the categorization of objects and estimating object pose, where the former necessitates a view-invariant representation, while the latter requires a representation capable of capturing pose information over different categories of objects.

Object Object Categorization +2

Sherlock: Scalable Fact Learning in Images

no code implementations 16 Nov 2015 Mohamed Elhoseiny, Scott Cohen, Walter Chang, Brian Price, Ahmed Elgammal

We show that learning visual facts in a structured way enables not only a uniform but also generalizable visual understanding.

Multiview Learning Retrieval

Digging Deep into the layers of CNNs: In Search of How CNNs Achieve View Invariance

no code implementations 9 Aug 2015 Amr Bakry, Mohamed Elhoseiny, Tarek El-Gaaly, Ahmed Elgammal

How does fine-tuning of a pre-trained CNN on a multi-view dataset affect the representation at each layer of the network?

Tell and Predict: Kernel Classifier Prediction for Unseen Visual Classes from Unstructured Text Descriptions

no code implementations 29 Jun 2015 Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh

In this paper we propose a framework for predicting kernelized classifiers in the visual domain for categories with no training images where the knowledge comes from textual description about these categories.

Zero-Shot Learning

Quantifying Creativity in Art Networks

no code implementations 2 Jun 2015 Ahmed Elgammal, Babak Saleh

The proposed computational framework is based on constructing a network between creative products and using this network to infer about the originality and influence of its nodes.

Large-scale Classification of Fine-Art Paintings: Learning The Right Metric on The Right Feature

1 code implementation 5 May 2015 Babak Saleh, Ahmed Elgammal

In the past few years, the number of fine-art collections that are digitized and publicly available has been growing rapidly.

General Classification Metric Learning

Factorization of View-Object Manifolds for Joint Object Recognition and Pose Estimation

no code implementations 23 Mar 2015 Haopeng Zhang, Tarek El-Gaaly, Ahmed Elgammal, Zhiguo Jiang

Due to large variations in shape, appearance, and viewing conditions, object recognition is a key precursory challenge in the fields of object manipulation and robotic/AI visual reasoning in general.

Object Object Recognition +2

Learning Hypergraph-regularized Attribute Predictors

no code implementations CVPR 2015 Sheng Huang, Mohamed Elhoseiny, Ahmed Elgammal, Dan Yang

Then the attribute prediction problem is cast as a regularized hypergraph cut problem in which HAP jointly learns a collection of attribute projections from the feature space to a hypergraph embedding space aligned with the attribute space.

Attribute hypergraph embedding

Abnormal Object Recognition: A Comprehensive Study

no code implementations 9 Nov 2014 Babak Saleh, Ali Farhadi, Ahmed Elgammal

When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.

Anomaly Detection Object +1

On The Effect of Hyperedge Weights On Hypergraph Learning

no code implementations 24 Oct 2014 Sheng Huang, Ahmed Elgammal, Dan Yang

However, many studies on pairwise graphs show that the choice of edge weight can significantly influence the performances of such graph algorithms.

Clustering Graph Learning

Computational Beauty: Aesthetic Judgment at the Intersection of Art and Science

no code implementations 30 Sep 2014 Emily L. Spratt, Ahmed Elgammal

Through an investigation of the ethical consequences of this innovative technology, the unquestioned authority of the art expert is challenged and the subjective nature of aesthetic judgment is brought to philosophical scrutiny once again.

Generalized Twin Gaussian Processes using Sharma-Mittal Divergence

no code implementations 26 Sep 2014 Mohamed Elhoseiny, Ahmed Elgammal

In this paper, we present a generalized structured regression framework based on Sharma-Mittal divergence, a relative entropy measure, which is introduced to the Machine Learning community in this work.

BIG-bench Machine Learning Gaussian Processes

Toward Automated Discovery of Artistic Influence

no code implementations 14 Aug 2014 Babak Saleh, Kanako Abe, Ravneet Singh Arora, Ahmed Elgammal

The contribution of this paper is in exploring the problem of computer-automated suggestion of influences between artists, a problem that was not addressed before in a general setting.

General Classification

Text to Multi-level MindMaps: A Novel Method for Hierarchical Visual Abstraction of Natural Language Text

no code implementations 1 Aug 2014 Mohamed Elhoseiny, Ahmed Elgammal

This work first introduces the MindMap Multilevel Visualization concept, which jointly visualizes and summarizes textual information.

Visual-Semantic Scene Understanding by Sharing Labels in a Context Network

no code implementations 16 Sep 2013 Ishani Chakraborty, Ahmed Elgammal

We consider the problem of naming objects in complex, natural scenes containing widely varying object appearance and subtly different names.

Data Augmentation Object +1

DISCOMAX: A Proximity-Preserving Distance Correlation Maximization Algorithm

no code implementations 11 Jun 2013 Praneeth Vepakomma, Ahmed Elgammal

Our setting is different from subset-selection algorithms where the problem is to choose the best subset of features for regression.

regression

Object-Centric Anomaly Detection by Attribute-Based Reasoning

no code implementations CVPR 2013 Babak Saleh, Ali Farhadi, Ahmed Elgammal

When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting.

Anomaly Detection Attribute +2

MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification

no code implementations CVPR 2013 Amr Bakry, Ahmed Elgammal

Our approach outperforms the baseline by at least 15% in the speaker semi-dependent setting, and is competitive in the other two settings.

Lipreading Speaker Identification +2
