Search Results for author: Mohammed Bennamoun

Found 120 papers, 39 papers with code

UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation

no code implementations 13 Nov 2024 ChengYuan Zhang, Yilin Zhang, Lei Zhu, Deyin Liu, Lin Wu, Bo Li, Shichao Zhang, Mohammed Bennamoun, Farid Boussaid

This paper introduces a novel framework for unified incremental few-shot object detection (iFSOD) and instance segmentation (iFSIS) using the Transformer architecture.

Decoder Few-Shot Object Detection +5

Referring Human Pose and Mask Estimation in the Wild

1 code implementation 27 Oct 2024 Bo Miao, Mingtao Feng, Zijie Wu, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian

We introduce Referring Human Pose and Mask Estimation (R-HPM) in the wild, where either a text or positional prompt specifies the person of interest in an image.

Decoder

Implicit to Explicit Entropy Regularization: Benchmarking ViT Fine-tuning under Noisy Labels

no code implementations 5 Oct 2024 Maria Marrium, Arif Mahmood, Mohammed Bennamoun

Consequently, Noisy Labels Learning (NLL) has become a critical research field for Convolutional Neural Networks (CNNs), though it remains less explored for Vision Transformers (ViTs).

Benchmarking

A Riemannian Approach for Spatiotemporal Analysis and Generation of 4D Tree-shaped Structures

1 code implementation 22 Aug 2024 Tahmina Khanam, Hamid Laga, Mohammed Bennamoun, Guanjin Wang, Ferdous Sohel, Farid Boussaid, Guan Wang, Anuj Srivastava

In this paper, we propose a novel mathematical representation of the shape space of such trajectories, a Riemannian metric on that space, and computational tools for fast and accurate spatiotemporal registration and geodesics computation between 4D tree-shaped structures.

Faster Image2Video Generation: A Closer Look at CLIP Image Embedding's Impact on Spatio-Temporal Cross-Attentions

no code implementations 27 Jul 2024 Ashkan Taghipour, Morteza Ghahremani, Mohammed Bennamoun, Aref Miri Rekavandi, Zinuo Li, Hamid Laga, Farid Boussaid

This paper investigates the role of CLIP image embeddings within the Stable Video Diffusion (SVD) framework, focusing on their impact on video generation quality and computational efficiency.

Computational Efficiency Video Generation

DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition

1 code implementation 6 Jul 2024 Qi Wang, Zhou Xu, Yuming Lin, Jingtao Ye, Hongsheng Li, Guangming Zhu, Syed Afaq Ali Shah, Mohammed Bennamoun, Liang Zhang

By setting a new benchmark in the field, we challenge the current limitations of neuromorphic data processing and invite a surge of new approaches in event-based action recognition techniques, which paves the way for future explorations in neuromorphic computing and beyond.

Action Recognition

Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey

no code implementations 28 Jun 2024 Uchitha Rajapaksha, Ferdous Sohel, Hamid Laga, Dean Diepeveen, Mohammed Bennamoun

Estimating depth from single RGB images and videos is of widespread interest due to its applications in many areas, including autonomous driving, 3D reconstruction, digital entertainment, and robotics.

3D Reconstruction Autonomous Driving +2

Supervised Radio Frequency Interference Detection with SNNs

no code implementations 10 Jun 2024 Nicholas J. Pritchard, Andreas Wicenec, Mohammed Bennamoun, Richard Dodson

This study underscores the potential of RFI detection as a benchmark problem for SNN researchers, emphasizing the efficacy of SNNs in addressing complex time-series segmentation tasks in radio astronomy.

Astronomy Time Series

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

1 code implementation CVPR 2024 Sajid Javed, Arif Mahmood, Iyyakutti Iyappan Ganapathi, Fayaz Ali Dharejo, Naoufel Werghi, Mohammed Bennamoun

This paper proposes Comprehensive Pathology Language Image Pre-training (CPLIP), a new unsupervised technique designed to enhance the alignment of images and text in histopathology for tasks such as classification and segmentation.

Contrastive Learning Zero-Shot Learning

Language Model Guided Interpretable Video Action Reasoning

no code implementations CVPR 2024 Ning Wang, Guangming Zhu, HS Li, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun

Extensive experiments on two complex video action datasets, Charades & CAD-120, validate the improved performance and interpretability of our LaIAR framework.

Action Recognition Decision Making +3

Temporally Consistent Referring Video Object Segmentation with Hybrid Memory

1 code implementation 28 Mar 2024 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Mubarak Shah, Ajmal Mian

Referring Video Object Segmentation (R-VOS) methods face challenges in maintaining consistent object segmentation due to temporal context variability and the presence of other visually similar objects.

HTR Object +6

Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation

no code implementations 2 Mar 2024 Lian Xu, Mohammed Bennamoun, Farid Boussaid, Wanli Ouyang, Ferdous Sohel, Dan Xu

We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from these saliency maps and the significant inter-task correlation between saliency detection and semantic segmentation.

Auxiliary Learning Multi-Label Image Classification +5

Box It to Bind It: Unified Layout Control and Attribute Binding in T2I Diffusion Models

1 code implementation 27 Feb 2024 Ashkan Taghipour, Morteza Ghahremani, Mohammed Bennamoun, Aref Miri Rekavandi, Hamid Laga, Farid Boussaid

To address these deficiencies, we introduce the Box-it-to-Bind-it (B2B) module - a novel, training-free approach for improving spatial control and semantic accuracy in text-to-image (T2I) diffusion models.

Attribute

Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review

1 code implementation 17 Feb 2024 Thang-Anh-Quan Nguyen, Amine Bourki, Mátyás Macudzinski, Anthony Brunel, Mohammed Bennamoun

This review thoroughly examines the role of semantically-aware Neural Radiance Fields (NeRFs) in visual scene understanding, covering an analysis of over 250 scholarly papers.

Panoptic Segmentation Scene Segmentation +2

RFI Detection with Spiking Neural Networks

1 code implementation 24 Nov 2023 Nicholas J. Pritchard, Andreas Wicenec, Mohammed Bennamoun, Richard Dodson

This work demonstrates the viability of SNNs as a promising avenue for machine-learning-based RFI detection in radio telescopes by establishing a minimal performance baseline on traditional and nascent satellite-based RFI sources and is the first work to our knowledge to apply SNNs in astronomy.

Astronomy Semantic Segmentation

HOMOE: A Memory-Based and Composition-Aware Framework for Zero-Shot Learning with Hopfield Network and Soft Mixture of Experts

no code implementations 23 Nov 2023 Do Huu Dat, Po Yuan Mao, Tien Hoang Nguyen, Wray Buntine, Mohammed Bennamoun

In our paper, we propose a novel framework that for the first time combines the Modern Hopfield Network with a Mixture of Experts (HOMOE) to classify the compositions of previously unseen objects.

Compositional Zero-Shot Learning

Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art

1 code implementation 10 Sep 2023 Aref Miri Rekavandi, Shima Rashidi, Farid Boussaid, Stephen Hoefs, Emre Akbas, Mohammed Bennamoun

Transformers have rapidly gained popularity in computer vision, especially in the field of object recognition and detection.

Object object-detection +2

MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation

1 code implementation 6 Aug 2023 Lian Xu, Mohammed Bennamoun, Farid Boussaid, Hamid Laga, Wanli Ouyang, Dan Xu

Building upon the observation that the attended regions of the one-class token in the standard vision transformer can contribute to a class-agnostic localization map, we explore the potential of the transformer model to capture class-specific attention for class-discriminative object localization by learning multiple class tokens.

Object Localization Weakly supervised Semantic Segmentation +1

Spectrum-guided Multi-granularity Referring Video Object Segmentation

1 code implementation ICCV 2023 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian

To address the drift problem, we propose a Spectrum-guided Multi-granularity (SgMg) approach, which performs direct segmentation on the encoded features and employs visual details to further optimize the masks.

Ranked #1 on Referring Expression Segmentation on J-HMDB (using extra training data)

Object Referring Expression Segmentation +4

A Bibliometric Review of Neuromorphic Computing and Spiking Neural Networks

no code implementations 14 Apr 2023 Nicholas J. Pritchard, Andreas Wicenec, Mohammed Bennamoun, Richard Dodson

In particular, spiking neural networks hold the potential to advance artificial intelligence as the basis of third-generation neural networks.

Survey

Analysis and Evaluation of Explainable Artificial Intelligence on Suicide Risk Assessment

no code implementations 9 Mar 2023 Hao Tang, Aref Miri Rekavandi, Dharjinder Rooprai, Girish Dwivedi, Frank Sanfilippo, Farid Boussaid, Mohammed Bennamoun

This study investigates the effectiveness of Explainable Artificial Intelligence (XAI) techniques in predicting suicide risks and identifying the dominant causes for such behaviours.

Data Augmentation Decision Making +2

VAPCNet: Viewpoint-Aware 3D Point Cloud Completion

no code implementations ICCV 2023 Zhiheng Fu, Longguang Wang, Lian Xu, Zhiyong Wang, Hamid Laga, Yulan Guo, Farid Boussaid, Mohammed Bennamoun

In this paper, we thus propose an unsupervised viewpoint representation learning scheme for 3D point cloud completion without explicit viewpoint estimation.

Point Cloud Completion Representation Learning +1

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization

no code implementations CVPR 2023 Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu

Weakly supervised dense object localization (WSDOL) relies generally on Class Activation Mapping (CAM), which exploits the correlation between the class weights of the image classifier and the pixel-level features.

Object Localization Representation Learning +2

3D Brain and Heart Volume Generative Models: A Survey

1 code implementation 12 Oct 2022 Yanbin Liu, Girish Dwivedi, Farid Boussaid, Mohammed Bennamoun

Generative models such as generative adversarial networks and autoencoders have gained a great deal of attention in the medical field due to their excellent data generation capability.

Denoising Survey

Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods

1 code implementation 17 Sep 2022 Laurent Jospin, Allen Antony, Lian Xu, Hamid Laga, Farid Boussaid, Mohammed Bennamoun

In this paper, we propose the Active-Passive SimStereo dataset and a corresponding benchmark to evaluate the performance gap between passive and active stereo images for stereo matching algorithms.

Benchmarking Stereo Matching

Bayesian Learning for Disparity Map Refinement for Semi-Dense Active Stereo Vision

no code implementations 12 Sep 2022 Laurent Valentin Jospin, Hamid Laga, Farid Boussaid, Mohammed Bennamoun

A major focus of recent developments in stereo vision has been on how to obtain accurate dense disparity maps in passive stereo vision.

Disparity Estimation

Inflating 2D Convolution Weights for Efficient Generation of 3D Medical Images

no code implementations 8 Aug 2022 Yanbin Liu, Girish Dwivedi, Farid Boussaid, Frank Sanfilippo, Makoto Yamada, Mohammed Bennamoun

Novel 3D network architectures are proposed for both the generator and discriminator of the GAN model to significantly reduce the number of parameters while maintaining the quality of image generation.

Image Generation Medical Image Generation

A Guide to Image and Video based Small Object Detection using Deep Learning : Case Study of Maritime Surveillance

no code implementations 26 Jul 2022 Aref Miri Rekavandi, Lian Xu, Farid Boussaid, Abd-Krim Seghouane, Stephen Hoefs, Mohammed Bennamoun

Small object detection (SOD) in optical images and videos is a challenging problem: even state-of-the-art generic object detection methods fail to accurately localize and identify such objects.

Decision Making Object +2

Spatial-temporal Analysis for Automated Concrete Workability Estimation

no code implementations 24 Jul 2022 Litao Yu, Jian Zhang, Mohammed Bennamoun, Xiaojun Chang, Vute Sirivivatnanon, Ali Nezhad

Concrete workability measure is mostly determined based on subjective assessment of a certified assessor with visual inspections.

regression

Region Aware Video Object Segmentation with Deep Motion Modeling

no code implementations 21 Jul 2022 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian

Current semi-supervised video object segmentation (VOS) methods usually leverage the entire features of one frame to predict object masks and update memory.

Decoder Object +4

Pseudo-Pair based Self-Similarity Learning for Unsupervised Person Re-identification

no code implementations 9 Jul 2022 Lin Wu, Deyin Liu, Wenying Zhang, Dapeng Chen, ZongYuan Ge, Farid Boussaid, Mohammed Bennamoun, Jialie Shen

In this paper, we present a pseudo-pair based self-similarity learning approach for unsupervised person re-ID without human annotations.

Unsupervised Person Re-Identification

Learning Resolution-Adaptive Representations for Cross-Resolution Person Re-Identification

no code implementations 9 Jul 2022 Lin Wu, Lingqiao Liu, Yang Wang, Zheng Zhang, Farid Boussaid, Mohammed Bennamoun

It is a challenging and practical problem since the query images often suffer from resolution degradation due to the different capturing conditions from real-world cameras.

Person Re-Identification Super-Resolution

Jacobian Norm with Selective Input Gradient Regularization for Improved and Interpretable Adversarial Defense

no code implementations 9 Jul 2022 Deyin Liu, Lin Wu, Haifeng Zhao, Farid Boussaid, Mohammed Bennamoun, Xianghua Xie

Moreover, adversarially training a defense model in general cannot produce interpretable predictions towards the inputs with perturbations, whilst a highly interpretable robust model is required by different domain experts to understand the behaviour of a DNN.

Adversarial Defense

CrossFormer: Cross Spatio-Temporal Transformer for 3D Human Pose Estimation

1 code implementation 24 Mar 2022 Mohammed Hassanin, Abdelwahed Khamiss, Mohammed Bennamoun, Farid Boussaid, Ibrahim Radwan

3D human pose estimation can be handled by encoding the geometric dependencies between the body parts and enforcing the kinematic constraints.

3D Human Pose Estimation

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation

1 code implementation CVPR 2022 Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu

To this end, we propose a Multi-class Token Transformer, termed as MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens.

Object Object Localization +2

Unsupervised Learning on 3D Point Clouds by Clustering and Contrasting

no code implementations 5 Feb 2022 Guofeng Mei, Litao Yu, Qiang Wu, Jian Zhang, Mohammed Bennamoun

This paper proposes a general unsupervised approach, named ConClu, to perform the learning of point-wise and global features by jointly leveraging point-level clustering and instance-level contrasting.

3D Object Classification Clustering +2

Spatio-Temporal Graph Representation Learning for Fraudster Group Detection

no code implementations 7 Jan 2022 Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

Then we use an RNN on the spatial relations to predict the spatio-temporal relations of reviewers in the group.

Graph Representation Learning

COTReg: Coupled Optimal Transport based Point Cloud Registration

no code implementations 29 Dec 2021 Guofeng Mei, Xiaoshui Huang, Litao Yu, Jian Zhang, Mohammed Bennamoun

Generating a set of high-quality correspondences or matches is one of the most critical steps in point cloud registration.

Point Cloud Registration

Explainable Artificial Intelligence for Pharmacovigilance: What Features Are Important When Predicting Adverse Outcomes?

no code implementations 25 Dec 2021 Isaac Ronald Ward, Ling Wang, Juan Lu, Mohammed Bennamoun, Girish Dwivedi, Frank M Sanfilippo

Using XAI, we quantified the contribution that specific drugs had on these ACS predictions, thus creating an XAI-based technique for pharmacovigilance monitoring, using ACS as an example of the adverse outcome to detect.

Explainable artificial intelligence Explainable Artificial Intelligence (XAI) +2

Generalized Closed-form Formulae for Feature-based Subpixel Alignment in Patch-based Matching

1 code implementation 2 Dec 2021 Laurent Valentin Jospin, Farid Boussaid, Hamid Laga, Mohammed Bennamoun

In this paper, we show that closed-form formulae for subpixel disparity computation exist for the case of one-dimensional matching, e.g., in the case of rectified stereo images where the search space is of one dimension, when using the standard NCC, SSD and SAD cost functions.

Optical Flow Estimation Patch Matching +1

Social Fraud Detection Review: Methods, Challenges and Analysis

no code implementations 10 Nov 2021 Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

Many studies proposed approaches based on user behaviors and review text to address the challenges of fraud detection.

Decision Making Fraud Detection

Training Spiking Neural Networks Using Lessons From Deep Learning

3 code implementations 27 Sep 2021 Jason K. Eshraghian, Max Ward, Emre Neftci, Xinxin Wang, Gregor Lenz, Girish Dwivedi, Mohammed Bennamoun, Doo Seok Jeong, Wei D. Lu

This paper serves as a tutorial and perspective showing how to apply the lessons learnt from several decades of research in deep learning, gradient descent, backpropagation and neuroscience to biologically plausible spiking neural networks.

Deep Learning

Deep Bayesian Image Set Classification: A Defence Approach against Adversarial Attacks

no code implementations 23 Aug 2021 Nima Mirnateghi, Syed Afaq Ali Shah, Mohammed Bennamoun

In practice, the vulnerability of deep learning systems against carefully perturbed images, known as adversarial examples, poses a dire security threat in physical-world applications.

Face Recognition Object Recognition +1

Tensor Pooling Driven Instance Segmentation Framework for Baggage Threat Recognition

1 code implementation 22 Aug 2021 Taimur Hassan, Samet Akcay, Mohammed Bennamoun, Salman Khan, Naoufel Werghi

Furthermore, to the best of our knowledge, this is the first contour instance segmentation framework that leverages multi-scale information to recognize cluttered and concealed contraband data from the colored and grayscale security X-ray imagery.

Instance Segmentation Segmentation +1

Self-Supervised Video Object Segmentation by Motion-Aware Mask Propagation

1 code implementation 27 Jul 2021 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian

We propose a self-supervised spatio-temporal matching method, coined Motion-Aware Mask Propagation (MAMP), for video object segmentation.

Segmentation Semantic Segmentation +2

Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation

1 code implementation ICCV 2021 Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Ferdous Sohel, Dan Xu

Motivated by the significant inter-task correlation, we propose a novel weakly supervised multi-task framework termed as AuxSegNet, to leverage saliency detection and multi-label image classification as auxiliary tasks to improve the primary task of semantic segmentation using only image-level ground-truth labels.

Auxiliary Learning Multi-Label Image Classification +6

A Systematic Collection of Medical Image Datasets for Deep Learning

1 code implementation 24 Jun 2021 Johann Li, Guangming Zhu, Cong Hua, Mingtao Feng, Basheer Bennamoun, Ping Li, Xiaoyuan Lu, Juan Song, Peiyi Shen, Xu Xu, Lin Mei, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun

Thus, as comprehensive as possible, this paper provides a collection of medical image datasets with their associated challenges for deep learning research.

Deep Learning Medical Image Analysis

Attack to Fool and Explain Deep Networks

no code implementations 20 Jun 2021 Naveed Akhtar, Muhammad A. A. K. Jalwana, Mohammed Bennamoun, Ajmal Mian

Exploring this phenomenon further, we alter the 'adversarial' objective of our attack to use it as a tool to 'explain' deep visual representation.

Adversarial Attack Image Manipulation

CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency

1 code implementation CVPR 2021 Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian

Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.

q-RBFNN: A Quantum Calculus-based RBF Neural Network

1 code implementation 2 Jun 2021 Syed Saiq Hussain, Muhammad Usman, Taha Hasan Masood Siddique, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

In this research, a novel stochastic gradient descent based learning approach for radial basis function neural networks (RBFNN) is proposed.

4D Atlas: Statistical Analysis of the Spatiotemporal Variability in Longitudinal 3D Shape Data

no code implementations 23 Jan 2021 Hamid Laga, Marcel Padilla, Ian H. Jermyn, Sebastian Kurtek, Mohammed Bennamoun, Anuj Srivastava

With this formulation, the statistical analysis of 4D surfaces can be cast as the problem of analyzing trajectories embedded in a nonlinear Riemannian manifold.

LCEval: Learned Composite Metric for Caption Evaluation

1 code implementation 24 Dec 2020 Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

Automatic evaluation metrics hold a fundamental importance in the development and fine-grained analysis of captioning systems.

Sentence

WEmbSim: A Simple yet Effective Metric for Image Captioning

no code implementations 24 Dec 2020 Naeha Sharif, Lyndon White, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

The area of automatic image caption evaluation is still undergoing intensive research to address the needs of generating captions which can meet adequacy and fluency requirements.

Image Captioning Word Embeddings

SubICap: Towards Subword-informed Image Captioning

no code implementations 24 Dec 2020 Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

In this work we address this common limitation of IC systems in dealing with rare words in the corpora.

Image Captioning Language Modelling

Imputation of Missing Data with Class Imbalance using Conditional Generative Adversarial Networks

no code implementations 1 Dec 2020 Saqib Ejaz Awan, Mohammed Bennamoun, Ferdous Sohel, Frank M Sanfilippo, Girish Dwivedi

State-of-the-art imputation approaches, such as Generative Adversarial Imputation Nets (GAIN), model the distribution of observed data to approximate the missing values.

Imputation Missing Values

A Practical Tutorial on Graph Neural Networks

1 code implementation 11 Oct 2020 Isaac Ronald Ward, Jack Joyner, Casey Lickfold, Yulan Guo, Mohammed Bennamoun

Graph neural networks (GNNs) have recently grown in popularity in the field of artificial intelligence (AI) due to their unique ability to ingest relatively unstructured data types as input data.

Multi-Kernel Fusion for RBF Neural Networks

1 code implementation 6 Jul 2020 Syed Muhammad Atif, Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

A simple yet effective architectural design of radial basis function neural networks (RBFNN) makes them amongst the most popular conventional neural networks.

Orthogonal Deep Models As Defense Against Black-Box Attacks

no code implementations 26 Jun 2020 Mohammad A. A. K. Jalwana, Naveed Akhtar, Mohammed Bennamoun, Ajmal Mian

On the other, deep learning has also been found vulnerable to adversarial attacks, which calls for new techniques to defend deep models against these attacks.

DFraud3- Multi-Component Fraud Detection free of Cold-start

no code implementations 10 Jun 2020 Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

In this research, instead of focusing only on one component, detecting either fraud reviews or fraud users (fraudsters), vector representations are learnt for each component, enabling multi-component classification.

Component Classification Fraud Detection +1

A Survey on Deep Learning Techniques for Stereo-based Depth Estimation

no code implementations 1 Jun 2020 Hamid Laga, Laurent Valentin Jospin, Farid Boussaid, Mohammed Bennamoun

Motivated by their growing success in solving various 2D and 3D vision problems, deep learning for stereo-based depth estimation has attracted growing interest from the community, with more than 150 papers published in this area between 2014 and 2019.

Ranked #1 on Monocular Depth Estimation on Make3D (RMSE metric)

Autonomous Driving Deep Learning +3

Efficient Scene Text Detection with Textual Attention Tower

no code implementations 30 Jan 2020 Liang Zhang, Yufei Liu, Hang Xiao, Lu Yang, Guangming Zhu, Syed Afaq Shah, Mohammed Bennamoun, Peiyi Shen

Scene text detection has received attention for years and achieved an impressive performance across various benchmarks.

Scene Text Detection Text Detection

Structure-Feature based Graph Self-adaptive Pooling

1 code implementation 30 Jan 2020 Liang Zhang, Xudong Wang, Hongsheng Li, Guangming Zhu, Peiyi Shen, Ping Li, Xiaoyuan Lu, Syed Afaq Ali Shah, Mohammed Bennamoun

To solve these problems mentioned above, we propose a novel graph self-adaptive pooling method with the following objectives: (1) to construct a reasonable pooled graph topology, structure and feature information of the graph are considered simultaneously, which provide additional veracity and objectivity in node selection; and (2) to make the pooled nodes contain sufficiently effective graph information, node feature information is aggregated before discarding the unimportant nodes; thus, the selected nodes contain information from neighbor nodes, which can enhance the use of features of the unselected nodes.

Graph Classification

Deep Learning for 3D Point Clouds: A Survey

3 code implementations 27 Dec 2019 Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, Mohammed Bennamoun

To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds.

3D Object Detection 3D Shape Classification +5

Biometrics Recognition Using Deep Learning: A Survey

1 code implementation 30 Nov 2019 Shervin Minaee, Amirali Abdolrashidi, Hang Su, Mohammed Bennamoun, David Zhang

Deep learning-based models have been very successful in achieving state-of-the-art results in many of the computer vision, speech recognition, and natural language processing tasks in the last few years.

Deep Learning Gait Recognition +3

RGB-D image-based Object Detection: from Traditional Methods to Deep Learning Techniques

no code implementations 22 Jul 2019 Isaac Ronald Ward, Hamid Laga, Mohammed Bennamoun

Deep learning techniques, coupled with the availability of large training datasets, have now revolutionized the field of computer vision, including RGB-D object detection, achieving an unprecedented level of performance.

Medical Diagnosis Object +3

Automatic Hierarchical Classification of Kelps using Deep Residual Features

no code implementations 26 Jun 2019 Ammar Mahmood, Ana Giraldo Ospina, Mohammed Bennamoun, Senjian An, Ferdous Sohel, Farid Boussaid, Renae Hovey, Robert B. Fisher, Gary Kendrick

Across the globe, remote image data is rapidly being collected for the assessment of benthic communities from shallow to extremely deep waters on continental slopes to the abyssal seas.

Binary Classification Classification +1

Image-based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era

no code implementations 15 Jun 2019 Xian-Feng Han, Hamid Laga, Mohammed Bennamoun

Given this new era of rapid evolution, this article provides a comprehensive survey of the recent developments in this field.

3D Object Reconstruction 3D Reconstruction +1

Label Universal Targeted Attack

no code implementations 27 May 2019 Naveed Akhtar, Mohammad A. A. K. Jalwana, Mohammed Bennamoun, Ajmal Mian

We introduce Label Universal Targeted Attack (LUTA) that makes a deep model predict a label of attacker's choice for 'any' sample of a given source class with high probability.

A Novel Adaptive Kernel for the RBF Neural Networks

no code implementations 9 May 2019 Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks.

General Classification

Improving Image-Based Localization with Deep Learning: The Impact of the Loss Function

no code implementations 28 Apr 2019 Isaac Ronald Ward, M. A. Asim K. Jalwana, Mohammed Bennamoun

This work investigates the impact of the loss function on the performance of Neural Networks, in the context of a monocular, RGB-only, image localization task.

Image-Based Localization regression

Optical Flow Techniques for Facial Expression Analysis -- a Practical Evaluation Study

no code implementations 25 Apr 2019 Benjamin Allaert, Isaac Ronald Ward, Ioan Marius Bilasco, Chaabane Djeraba, Mohammed Bennamoun

Optical flow techniques are becoming increasingly performant and robust when estimating motion in a scene, but their performance has yet to be proven in the area of facial expression recognition.

Data Augmentation Facial Expression Recognition +2

Imaging and Classification Techniques for Seagrass Mapping and Monitoring: A Comprehensive Survey

no code implementations 26 Feb 2019 Md Moniruzzaman, S. M. Shamsul Islam, Paul Lavery, Mohammed Bennamoun, C. Peng Lam

The detection and mapping of underwater vegetation, especially seagrass, has drawn the attention of the research community as early as the nineteen eighties.

General Classification

Attention in Convolutional LSTM for Gesture Recognition

1 code implementation NeurIPS 2018 Liang Zhang, Guangming Zhu, Lin Mei, Peiyi Shen, Syed Afaq Ali Shah, Mohammed Bennamoun

On this basis, a new variant of LSTM is derived, in which the convolutional structures are only embedded into the input-to-state transition of LSTM.

Gesture Recognition

RAFP-Pred: Robust Prediction of Antifreeze Proteins using Localized Analysis of n-Peptide Compositions

no code implementations 25 Sep 2018 Shujaat Khan, Imran Naseem, Roberto Togneri, Mohammed Bennamoun

In extreme cold weather, living organisms produce Antifreeze Proteins (AFPs) to counter the otherwise lethal intracellular formation of ice.

Specificity

NNEval: Neural Network based Evaluation Metric for Image Captioning

no code implementations ECCV 2018 Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah

The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems.

Image Captioning Sentence

DataDeps.jl: Repeatable Data Setup for Replicable Data Science

2 code implementations 3 Aug 2018 Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun

We present DataDeps.jl: a Julia package for the reproducible handling of static datasets to enhance the repeatability of scripts used in the data and computational sciences.

Software Engineering

Learning-based Composite Metrics for Improved Caption Evaluation

no code implementations ACL 2018 Naeha Sharif, Lyndon White, Mohammed Bennamoun, Syed Afaq Ali Shah

The evaluation of image caption quality is a challenging task, which requires the assessment of two main aspects in a caption: adequacy and fluency.

Image Captioning Language Modelling +2

NovelPerspective: Identifying Point of View Characters

1 code implementation ACL 2018 Lyndon White, Roberto Togneri, Wei Liu, Mohammed Bennamoun

Our tool detects the main character from whose point of view (POV) each section is told, and allows the user to generate a new ebook with only those sections.

Named Entity Recognition (NER)

Exploiting Layerwise Convexity of Rectifier Networks with Sign Constrained Weights

no code implementations 14 Nov 2017 Senjian An, Farid Boussaid, Mohammed Bennamoun, Ferdous Sohel

By introducing sign constraints on the weights, this paper proposes sign constrained rectifier networks (SCRNs), whose training can be solved efficiently by the well-known majorization-minimization (MM) algorithms.

Learning Action Recognition Model From Depth and Skeleton Videos

no code implementations ICCV 2017 Hossein Rahmani, Mohammed Bennamoun

Depth sensors open up possibilities of dealing with the human action recognition problem by providing 3D human skeleton data and depth images of the scene.

Action Recognition Human-Object Interaction Detection +1

On the Compressive Power of Deep Rectifier Networks for High Resolution Representation of Class Boundaries

no code implementations 24 Aug 2017 Senjian An, Mohammed Bennamoun, Farid Boussaid

To show the superior compressive power of deep rectifier networks over shallow rectifier networks, we prove that the maximum boundary resolution of a single hidden layer rectifier network classifier grows exponentially with the number of units when this number is smaller than the dimension of the patterns.

General Classification

From Deep to Shallow: Transformations of Deep Rectifier Networks

no code implementations 30 Mar 2017 Senjian An, Farid Boussaid, Mohammed Bennamoun, Jiankun Hu

Similarly, for a residual net and a conventional rectifier net with the same structure except for the skip connections in the residual net, the corresponding single hidden layer representation of the residual net is much more complex than the corresponding single hidden layer representation of the conventional net.

ResFeats: Residual Network Based Features for Image Classification

no code implementations 21 Nov 2016 Ammar Mahmood, Mohammed Bennamoun, Senjian An, Ferdous Sohel

Deep residual networks have recently emerged as the state-of-the-art architecture in image classification and object detection.

Classification Dimensionality Reduction +8

Leveraging Structural Context Models and Ranking Score Fusion for Human Interaction Prediction

no code implementations 18 Aug 2016 Qiuhong Ke, Mohammed Bennamoun, Senjian An, Farid Boussaid, Ferdous Sohel

The structural models, including the spatial and the temporal models, are learned with Long Short Term Memory (LSTM) networks to capture the dependency of the global and local contexts of each RGB frame and each optical flow image, respectively.

Optical Flow Estimation

Learning deep structured network for weakly supervised change detection

no code implementations 7 Jun 2016 Salman H. Khan, Xuming He, Fatih Porikli, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

We apply a constrained mean-field algorithm to estimate the pixel-level labels, and use the estimated labels to update the parameters of the CNN in an iterative EM framework.

Change Detection

Contractive Rectifier Networks for Nonlinear Maximum Margin Classification

no code implementations ICCV 2015 Senjian An, Munawar Hayat, Salman H. Khan, Mohammed Bennamoun, Farid Boussaid, Ferdous Sohel

The contractive constraints ensure that the achieved separating margin in the input space is larger than or equal to the separating margin in the output layer.

Classification General Classification

A Spatial Layout and Scale Invariant Feature Representation for Indoor Scene Classification

no code implementations 18 Jun 2015 Munawar Hayat, Salman H. Khan, Mohammed Bennamoun, Senjian An

This paper introduces a new learnable feature descriptor called "spatial layout and scale invariant convolutional activations" to deal with these challenges.

General Classification Scene Classification

Separating Objects and Clutter in Indoor Scenes

no code implementations CVPR 2015 Salman H. Khan, Xuming He, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

Objects' spatial layout estimation and clutter identification are two important tasks to understand indoor scenes.

Automatic Feature Learning for Robust Shadow Detection

no code implementations CVPR 2014 Salman Hameed Khan, Mohammed Bennamoun, Ferdous Sohel, Roberto Togneri

We present a practical framework to automatically detect shadows in real world scenes from a single photograph.

Shadow Detection

Rotational Projection Statistics for 3D Local Surface Description and Object Recognition

no code implementations 11 Apr 2013 Yulan Guo, Ferdous Sohel, Mohammed Bennamoun, Min Lu, Jianwei Wan

The performance of the proposed LRF, RoPS descriptor and object recognition algorithm was rigorously tested on a number of popular and publicly available datasets.

3D Object Recognition Object
