1 code implementation • 6 Oct 2023 • Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu
The key challenge in designing foundation models for KGs is learning transferable representations that enable inference on any graph with arbitrary entity and relation vocabularies.
1 code implementation • 18 Dec 2021 • Shengyu Feng, Subarna Tripathi, Hesham Mostafa, Marcel Nassar, Somdeb Majumdar
Dynamic scene graph generation from a video is challenging due to the temporal dynamics of the scene and the inherent temporal fluctuations of predictions.
1 code implementation • NeurIPS 2021 • Sami Abu-El-Haija, Hesham Mostafa, Marcel Nassar, Valentino Crespi, Greg Ver Steeg, Aram Galstyan
Recent improvements in the performance of state-of-the-art (SOTA) methods for Graph Representational Learning (GRL) have come at the cost of significant computational resource requirements for training, e.g., for calculating gradients via backprop over many data epochs.
1 code implementation • 11 Nov 2021 • Hesham Mostafa
We present the Sequential Aggregation and Rematerialization (SAR) scheme for distributed full-batch training of Graph Neural Networks (GNNs) on large graphs.
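A minimal sketch of the memory pattern this implies, assuming a 2-D adjacency matrix split into per-partition blocks and hypothetical fetch() callables that materialize remote features on demand; this illustrates the one-block-at-a-time aggregation idea, not the actual SAR library API:

import torch

def sar_aggregate(x_local, adj_local, remote_blocks):
    # One layer's neighborhood aggregation, one partition at a time.
    # remote_blocks: (adj_block, fetch) pairs; fetch() materializes a
    # remote partition's node features on demand. Only one remote block
    # is resident at any moment; in SAR proper it is freed after use and
    # rematerialized (re-fetched) during the backward pass.
    h = adj_local @ x_local
    for adj_block, fetch in remote_blocks:
        x_remote = fetch()            # materialize remote features
        h = h + adj_block @ x_remote  # add this block's contribution
        del x_remote                  # free it before the next block
    return h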
no code implementations • 6 Jun 2021 • Hesham Mostafa, Marcel Nassar, Somdeb Majumdar
We also show that homophily is a poor measure of the information in a node's local neighborhood and propose the Neighborhood Information Content (NIC) metric, a novel information-theoretic graph measure.
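For contrast, the conventional edge-homophily measure the paper argues against is simple to state; NIC itself is defined information-theoretically in the paper and is not reproduced here. A minimal sketch, assuming a 2xE edge-index tensor and integer class labels:

import torch

def edge_homophily(edge_index, labels):
    # Fraction of edges whose endpoints share a class label -- the
    # standard homophily measure that, per the paper, understates the
    # information in a node's local neighborhood.
    src, dst = edge_index
    return (labels[src] == labels[dst]).float().mean().item()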
no code implementations • 17 Dec 2020 • Souvik Kundu, Hesham Mostafa, Sharath Nittur Sridhar, Sairam Sundaresan
Convolutional layers are an integral part of many deep neural network solutions in computer vision.
no code implementations • 2 Mar 2020 • Hesham Mostafa, Marcel Nassar
The attention coefficients depend on the Euclidean distance between learnable node embeddings, and we show that the resulting attention-based global aggregation scheme is analogous to high-dimensional Gaussian filtering.
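A minimal dense sketch of such distance-based attention; the Gaussian bandwidth sigma and the row-wise softmax normalization are illustrative assumptions rather than the paper's exact parameterization:

import torch

def gaussian_attention(node_emb, sigma=1.0):
    # Attention logits from squared Euclidean distances between
    # learnable node embeddings: nearby embeddings attend strongly to
    # each other, mirroring high-dimensional Gaussian filtering.
    d2 = torch.cdist(node_emb, node_emb).pow(2)
    return torch.softmax(-d2 / (2 * sigma ** 2), dim=-1)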
no code implementations • 30 Dec 2019 • Hesham Mostafa
We propose a novel representation matching scheme that reduces the divergence of local models by ensuring the feature representations in the global (aggregate) model can be derived from the locally learned representations.
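A hedged sketch of what such a matching penalty could look like: an auxiliary loss asking a learnable linear map proj to reconstruct the frozen global model's features from the local ones (the linear map and the MSE form are assumptions for illustration, not the paper's exact objective):

import torch.nn.functional as F

def representation_matching_loss(local_feats, global_feats, proj):
    # Auxiliary term added to a client's objective: the global model's
    # features should be derivable from the locally learned features,
    # which curbs divergence across local models.
    return F.mse_loss(proj(local_feats), global_feats.detach())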
1 code implementation • 16 Jul 2019 • Mark D. McDonnell, Hesham Mostafa, Runchun Wang, Andre van Schaik
Following experiments with wide residual networks on the ImageNet, CIFAR-10, and CIFAR-100 image classification datasets, we found that batch-normalization (BN) layers do not consistently offer a significant advantage.
no code implementations • 15 Feb 2019 • Hesham Mostafa, Xin Wang
We evaluate dynamic reallocation methods for training deep convolutional networks and show that our method outperforms previous static and dynamic reparameterization methods, yielding the best accuracy for a fixed parameter budget, on par with accuracies obtained by iteratively pruning a pre-trained dense model.
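A simplified per-tensor sketch of one dynamic-reallocation step under a fixed parameter budget, assuming 2-D weight tensors, a float 0/1 mask, and random regrowth (the paper reallocates adaptively across layers; the fraction and regrowth rule here are illustrative):

import torch

def reallocate_step(weight, mask, prune_frac=0.1):
    # Prune the smallest-magnitude surviving weights, then regrow the
    # same number at randomly chosen zeroed positions, so the total
    # parameter count stays fixed throughout training.
    rows, cols = mask.nonzero(as_tuple=True)
    k = int(prune_frac * rows.numel())
    if k == 0:
        return
    drop = weight[rows, cols].abs().topk(k, largest=False).indices
    mask[rows[drop], cols[drop]] = 0
    zeros = (mask == 0).nonzero()
    grow = zeros[torch.randperm(zeros.shape[0])[:k]]
    mask[grow[:, 0], grow[:, 1]] = 1
    weight.data.mul_(mask)   # regrown weights restart from zero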
4 code implementations • 28 Jan 2019 • Emre O. Neftci, Hesham Mostafa, Friedemann Zenke
Spiking neural networks are nature's versatile solution to fault-tolerant and energy-efficient signal processing.
3 code implementations • 27 Nov 2018 • Jacques Kaiser, Hesham Mostafa, Emre Neftci
A relatively small body of work, however, discusses similarities between learning dynamics employed in deep artificial neural networks and synaptic plasticity in spiking neural networks.
no code implementations • NIPS Workshop CDNNRIA 2018 • Hesham Mostafa, Xin Wang
Network pruning has emerged as a powerful technique for reducing the size of deep neural networks.
no code implementations • 17 Nov 2017 • Hesham Mostafa, Vishwajith Ramesh, Gert Cauwenberghs
Updating the features or weights in one layer, however, requires waiting for the propagation of error signals from higher layers.
no code implementations • 14 Aug 2017 • Hesham Mostafa, Gert Cauwenberghs
This allows us to use the proposed networks in a variational learning setting where stochastic backpropagation is used to optimize a lower bound on the data log likelihood, thereby learning a generative model of the data.
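The lower bound in question is the standard variational (evidence lower) bound; for observations $x$ and latent variables $z$,

$$\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big),$$

with stochastic backpropagation providing gradient estimates through the sampling step.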
no code implementations • 15 Jun 2017 • Hesham Mostafa, Bruno Pedroni, Sadique Sheik, Gert Cauwenberghs
In this paper, we describe a hardware-efficient on-line learning technique for feedforward multi-layer ANNs that is based on pipelined backpropagation.
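A toy software sketch of the staleness that pipelined backpropagation trades for throughput: the first layer's update for a sample arrives only after a pipeline delay, by which time the downstream weights have already moved. The layer sizes, delay, and MSE objective are illustrative assumptions, not the hardware design:

import numpy as np
from collections import deque

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(4, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))
pending, lr, delay = deque(), 0.01, 2   # errors in flight toward layer 1

for _ in range(1000):
    x = rng.normal(size=(1, 4))
    y = x.sum(keepdims=True)            # toy regression target
    h = np.maximum(x @ W1, 0.0)         # stage-1 forward (ReLU)
    e2 = (h @ W2) - y                   # stage-2 forward + output error
    W2 -= lr * h.T @ e2                 # the top layer updates immediately
    pending.append((x, h, e2))
    if len(pending) > delay:            # the error reaches layer 1 late...
        x0, h0, e0 = pending.popleft()
        e1 = (e0 @ W2.T) * (h0 > 0)     # ...through already-updated (stale) W2
        W1 -= lr * x0.T @ e1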
no code implementations • 5 Jun 2017 • Alessandro Aimar, Hesham Mostafa, Enrico Calabrese, Antonio Rios-Navarro, Ricardo Tapiador-Morales, Iulia-Alexandra Lungu, Moritz B. Milde, Federico Corradi, Alejandro Linares-Barranco, Shih-Chii Liu, Tobi Delbruck
By exploiting sparsity, NullHop achieves an efficiency of 368% (effective operations exceed the MAC array's nominal peak because multiplications by zero activations are skipped), maintains over 98% utilization of the MAC units, and achieves a power efficiency of over 3 TOp/s/W in a core area of 6.3 mm$^2$.
1 code implementation • 27 Jun 2016 • Hesham Mostafa
Gradient descent training techniques are remarkably successful in training analog-valued artificial neural networks (ANNs).
no code implementations • 9 Dec 2015 • Hesham Mostafa, Giacomo Indiveri
We show that stochastic artificial neurons can be realized on silicon chips by exploiting the quasi-periodic behavior of mismatched analog oscillators to approximate the neuron's stochastic activation function.
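A toy illustration of the trick, with illustrative frequencies: a deterministic sum of two incommensurate oscillators, sampled at uncorrelated times, behaves like a random variable, so thresholding it against the neuron's input yields a pseudo-stochastic, monotonically increasing activation probability:

import numpy as np

rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1e4, size=100_000)   # asynchronous sample times
s = np.sin(2 * np.pi * t) + np.sin(2 * np.pi * np.sqrt(2) * t)

def fire_prob(u):
    # Empirical probability that the quasi-periodic signal exceeds a
    # threshold shifted by the input u: a smooth, sigmoid-like curve.
    return float(np.mean(s + u > 0.0))

print(fire_prob(-1.0), fire_prob(0.0), fire_prob(1.0))  # increases with u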
no code implementations • NeurIPS 2013 • Hesham Mostafa, Lorenz K. Mueller, Giacomo Indiveri
If there is no solution that satisfies all constraints, the network state changes in a pseudo-random manner and its trajectory approximates a sampling procedure that selects a variable assignment with a probability that increases with the fraction of constraints satisfied by this assignment.
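A pure-software stand-in for these dynamics (the actual system is a network of coupled spiking neurons; the clause encoding and flip rule below are assumptions): while constraints are violated the state performs a pseudo-random walk, and satisfying states are absorbing:

import random

def sample_assignment(clauses, n_vars, steps=10_000):
    # clauses: list of clauses, each a list of (variable, wanted_value)
    # literals; a clause is satisfied when any literal matches the state.
    x = [random.random() < 0.5 for _ in range(n_vars)]
    for _ in range(steps):
        unsat = [c for c in clauses
                 if not any(x[v] == want for v, want in c)]
        if not unsat:
            break                       # a satisfying state is stable
        v, _ = random.choice(random.choice(unsat))
        x[v] = not x[v]                 # pseudo-random move off a violation
    return x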