Search Results for author: Fahad Shahbaz Khan

Found 122 papers, 86 papers with code

Fixing Localization Errors to Improve Image Classification

1 code implementation ECCV 2020 Guolei Sun, Salman Khan, Wen Li, Hisham Cholakkal, Fahad Shahbaz Khan, Luc van Gool

This way, in an effort to fix localization errors, our loss provides an extra supervisory signal that helps the model to better discriminate between similar classes.

Classification General Classification +3

Count- and Similarity-aware R-CNN for Pedestrian Detection

no code implementations ECCV 2020 Jin Xie, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Mubarak Shah

We further introduce a count-and-similarity branch within the two-stage detection framework, which predicts pedestrian count as well as proposal similarity.

Human Instance Segmentation Pedestrian Detection +1

Guidance Through Surrogate: Towards a Generic Diagnostic Attack

no code implementations 30 Dec 2022 Muzammal Naseer, Salman Khan, Fatih Porikli, Fahad Shahbaz Khan

Recently, different adversarial training defenses have been proposed that not only maintain a high clean accuracy but also show significant robustness against popular and well-studied adversarial attacks such as PGD.

Adversarial Robustness

Fine-tuned CLIP Models are Efficient Video Learners

1 code implementation 6 Dec 2022 Hanoona Rasheed, Muhammad Uzair Khattak, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan

Since training on a similar scale for videos is infeasible, recent approaches focus on the effective transfer of image-based CLIP to the video domain.

Lightning Fast Video Anomaly Detection via Adversarial Knowledge Distillation

no code implementations 28 Nov 2022 Nicolae-Catalin Ristea, Florinel-Alin Croitoru, Dana Dascalescu, Radu Tudor Ionescu, Fahad Shahbaz Khan, Mubarak Shah

We propose a very fast frame-level model for anomaly detection in video, which learns to detect anomalies by distilling knowledge from multiple highly accurate object-level teacher models.

Anomaly Detection Knowledge Distillation +1

Person Image Synthesis via Denoising Diffusion Model

1 code implementation 22 Nov 2022 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Jorma Laaksonen, Mubarak Shah, Fahad Shahbaz Khan

In this work, we show how denoising diffusion models can be applied for high-fidelity person image synthesis with strong sample diversity and enhanced mode coverage of the learnt data distribution.

Denoising Image Generation

MaPLe: Multi-modal Prompt Learning

1 code implementation Technical Report 2022 Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan

Pre-trained vision-language (V-L) models such as CLIP have shown excellent generalization ability to downstream tasks.

Prompt Engineering

Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection

1 code implementation 25 Sep 2022 Neelu Madan, Nicolae-Catalin Ristea, Radu Tudor Ionescu, Kamal Nasrollahi, Fahad Shahbaz Khan, Thomas B. Moeslund, Mubarak Shah

In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, as well as a transformer for channel-wise attention.

Anomaly Detection Event Detection +1

CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection

no code implementations 13 Sep 2022 Dhanalaxmi Gaddam, Jean Lahoud, Fahad Shahbaz Khan, Rao Muhammad Anwer, Hisham Cholakkal

In this work, we propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework, which takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene at multiple levels to predict a set of object bounding boxes along with their corresponding semantic labels.

3D Object Detection Object Counting +1

Transformers in Remote Sensing: A Survey

no code implementations 2 Sep 2022 Abdulaziz Amer Aleissaee, Amandeep Kumar, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal, Gui-Song Xia, Fahad Shahbaz Khan

Deep learning-based algorithms have seen massive popularity in different areas of remote sensing image analysis over the past decade.

AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility

1 code implementation 14 Aug 2022 Mubashir Noman, Wafa Al Ghallabi, Daniya Najiha, Christoph Mayer, Akshay Dudhane, Martin Danelljan, Hisham Cholakkal, Salman Khan, Luc van Gool, Fahad Shahbaz Khan

While greatly benefiting tracking research, existing benchmarks no longer pose the same difficulty as before, with recent trackers achieving higher performance mainly due to (i) the introduction of more sophisticated transformer-based methods and (ii) the lack of diverse scenarios with adverse visibility, such as severe weather conditions, camouflage and imaging effects.

Visual Object Tracking Visual Tracking

Multi-scale Feature Aggregation for Crowd Counting

no code implementations 10 Aug 2022 Xiaoheng Jiang, Xinyi Wu, Hisham Cholakkal, Rao Muhammad Anwer, Jiale Cao, Mingliang Xu, Bing Zhou, Yanwei Pang, Fahad Shahbaz Khan

The SkipAgg module directly propagates features with small receptive fields to features with much larger receptive fields.

Crowd Counting

3D Vision with Transformers: A Survey

1 code implementation 8 Aug 2022 Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang

The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field.

Pose Estimation

Self-Distilled Vision Transformer for Domain Generalization

1 code implementation 25 Jul 2022 Maryam Sultana, Muzammal Naseer, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan

Similar to CNNs, ViTs also struggle in out-of-distribution scenarios and the main culprit is overfitting to source domains.

Domain Generalization

Adversarial Pixel Restoration as a Pretext Task for Transferable Perturbations

1 code implementation 18 Jul 2022 Hashmat Shadab Malik, Shahina K Kunhimon, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Our training approach is based on a min-max scheme which reduces overfitting via an adversarial objective and thus optimizes for a more generalizable surrogate model.

object-detection Object Detection +2

OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning

1 code implementation 5 Jul 2022 Mamshad Nayeem Rizve, Navid Kardan, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

In the open-world SSL problem, the objective is to recognize samples of known classes, and simultaneously detect and cluster samples belonging to novel classes present in unlabeled data.

Open-World Semi-Supervised Learning

EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications

4 code implementations 21 Jun 2022 Muhammad Maaz, Abdelrahman Shaker, Hisham Cholakkal, Salman Khan, Syed Waqas Zamir, Rao Muhammad Anwer, Fahad Shahbaz Khan

Our EdgeNeXt model with 1.3M parameters achieves 71.2% top-1 accuracy on ImageNet-1K, outperforming MobileViT with an absolute gain of 2.2% and a 28% reduction in FLOPs.

Image Classification Object Detection +1

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations 11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that improved efficiency, measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution Single Image Super Resolution

Self-Supervised Video Object Segmentation via Cutout Prediction and Tagging

no code implementations 22 Apr 2022 Jyoti Kini, Fahad Shahbaz Khan, Salman Khan, Mubarak Shah

We propose a novel self-supervised Video Object Segmentation (VOS) approach that strives to achieve better object-background discriminability for accurate object segmentation.

Semantic Segmentation TAG +2

Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution

1 code implementation 8 Apr 2022 Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Andreea-Iuliana Miron, Olivian Savencu, Nicolae-Catalin Ristea, Nicolae Verga, Fahad Shahbaz Khan

Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple concatenated input tensors, where the kernel (receptive field) size controls the reduction rate of the spatial attention, and the number of convolutional filters controls the reduction rate of the channel attention, respectively.

Computed Tomography (CT) Image Super-Resolution

PSTR: End-to-End One-Step Person Search With Transformers

1 code implementation CVPR 2022 Jiale Cao, Yanwei Pang, Rao Muhammad Anwer, Hisham Cholakkal, Jin Xie, Mubarak Shah, Fahad Shahbaz Khan

We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture.

Human Detection Person Search

Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer

1 code implementation 24 Mar 2022 Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan

When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in the literature by 2.7%, and by 4.8% at the higher overlap threshold of AP_75, while being comparable in model size and speed on YouTube-VIS 2019 val.

Instance Segmentation Semantic Segmentation +2

SepTr: Separable Transformer for Audio Spectrogram Processing

1 code implementation 17 Mar 2022 Nicolae-Catalin Ristea, Radu Tudor Ionescu, Fahad Shahbaz Khan

Following the successful application of vision transformers in multiple computer vision tasks, these models have drawn the attention of the signal processing community.

Audio Classification Speech Emotion Recognition +1

Transformers in Medical Imaging: A Survey

1 code implementation 24 Jan 2022 Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu

Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators.

Image Classification Image Segmentation +6

DoodleFormer: Creative Sketch Drawing with Transformers

1 code implementation 6 Dec 2021 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laaksonen, Michael Felsberg

Creative sketch image generation is a challenging vision problem, where the task is to generate diverse, yet realistic creative sketches possessing the unseen composition of the visual-world objects.

Image Generation

OW-DETR: Open-world Detection Transformer

2 code implementations CVPR 2022 Akshita Gupta, Sanath Narayan, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on PASCAL VOC.

Inductive Bias object-detection +2

Self-supervised Video Transformer

1 code implementation CVPR 2022 Kanchana Ranasinghe, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Michael Ryoo

To the best of our knowledge, the proposed approach is the first to alleviate the dependency on negative samples or dedicated memory banks in Self-supervised Video Transformer (SVT).

Action Classification Action Recognition +1

Restormer: Efficient Transformer for High-Resolution Image Restoration

10 code implementations CVPR 2022 Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang

Since convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, these models have been extensively applied to image restoration and related tasks.

Color Image Denoising Deblurring +6

Burst Image Restoration and Enhancement

1 code implementation CVPR 2022 Akshay Dudhane, Syed Waqas Zamir, Salman Khan, Fahad Shahbaz Khan, Ming-Hsuan Yang

Our central idea is to create a set of pseudo-burst features that combine complementary information from all the input burst frames to seamlessly exchange information.

Burst Image Super-Resolution Denoising +3

Dense Gaussian Processes for Few-Shot Segmentation

1 code implementation 7 Oct 2021 Joakim Johnander, Johan Edstedt, Michael Felsberg, Fahad Shahbaz Khan, Martin Danelljan

Given the support set, our dense GP learns the mapping from local deep image features to mask values, capable of capturing complex appearance distributions.

Few-Shot Semantic Segmentation Gaussian Processes

Discriminative Region-based Multi-Label Zero-Shot Learning

1 code implementation ICCV 2021 Sanath Narayan, Akshita Gupta, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Mubarak Shah

We note that the best existing multi-label ZSL method takes a shared approach towards attending to region features with a common set of attention maps for all the classes.

Image Retrieval Multi-label zero-shot learning

Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains

no code implementations 15 Jul 2021 Puneet Mangla, Shivam Chandhok, Vineeth N Balasubramanian, Fahad Shahbaz Khan

Recent progress towards designing models that can generalize to unseen domains (i.e., domain generalization) or unseen classes (i.e., zero-shot learning) has sparked interest in building models that can tackle both domain shift and semantic shift simultaneously (i.e., zero-shot domain generalization).

Domain Generalization Zero-Shot Learning +1

Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains

no code implementations 12 Jul 2021 Shivam Chandhok, Sanath Narayan, Hisham Cholakkal, Rao Muhammad Anwer, Vineeth N Balasubramanian, Fahad Shahbaz Khan, Ling Shao

The need to address the scarcity of task-specific annotated data has resulted in concerted efforts in recent years for specific settings such as zero-shot learning (ZSL) and domain generalization (DG), to separately address the issues of semantic shift and domain shift, respectively.

Domain Generalization Zero-Shot Learning +1

On Improving Adversarial Transferability of Vision Transformers

2 code implementations ICLR 2022 Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Fahad Shahbaz Khan, Fatih Porikli

(ii) Token Refinement: We then propose to refine the tokens to further enhance the discriminative capacity at each block of ViT.

Adversarial Attack

Intriguing Properties of Vision Transformers

1 code implementation NeurIPS 2021 Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang

We show and analyze the following intriguing properties of ViT: (a) Transformers are highly robust to severe occlusions, perturbations and domain shifts, e.g., they retain as high as 60% top-1 accuracy on ImageNet even after randomly occluding 80% of the image content.

Few-Shot Learning Semantic Segmentation

MineGAN++: Mining Generative Models for Efficient Knowledge Transfer to Limited Data Domains

1 code implementation 28 Apr 2021 Yaxing Wang, Abel Gonzalez-Garcia, Chenshen Wu, Luis Herranz, Fahad Shahbaz Khan, Shangling Jui, Joost Van de Weijer

Therefore, we propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs.

Transfer Learning

Rich Semantics Improve Few-shot Learning

1 code implementation 26 Apr 2021 Mohamed Afham, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

Human learning benefits from multi-modal inputs that often appear as rich semantics (e.g., description of an object's attributes while learning about it).

 Ranked #1 on Few-Shot Image Classification on Oxford 102 Flower (using extra training data)

Few-Shot Image Classification

Handwriting Transformers

1 code implementation ICCV 2021 Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Mubarak Shah

We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns.

Image Generation Text Generation

Deep Gaussian Processes for Few-Shot Segmentation

no code implementations 30 Mar 2021 Joakim Johnander, Johan Edstedt, Martin Danelljan, Michael Felsberg, Fahad Shahbaz Khan

Through the expressivity of the GP, our approach is capable of modeling complex appearance distributions in the deep feature space.

Gaussian Processes

On Generating Transferable Targeted Perturbations

3 code implementations ICCV 2021 Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli

To this end, we propose a new objective function that not only aligns the global distributions of source and target images, but also matches the local neighbourhood structure between the two domains.

Orthogonal Projection Loss

1 code implementation ICCV 2021 Kanchana Ranasinghe, Muzammal Naseer, Munawar Hayat, Salman Khan, Fahad Shahbaz Khan

The CE loss encourages features of a class to have a higher projection score on the true class-vector compared to the negative classes.

Domain Generalization Few-Shot Learning

Multi-Stage Progressive Image Restoration

7 code implementations CVPR 2021 Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

At each stage, we introduce a novel per-pixel adaptive design that leverages in-situ supervised attention to reweight the local features.

Deblurring Image Deblurring +3

Transformers in Vision: A Survey

no code implementations 4 Jan 2021 Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, Mubarak Shah

Astounding results from Transformer models on natural language tasks have intrigued the vision community to study their application to computer vision problems.

Action Recognition Colorization +10

Low Light Image Enhancement via Global and Local Context Modeling

no code implementations 4 Jan 2021 Aditya Arora, Muhammad Haris, Syed Waqas Zamir, Munawar Hayat, Fahad Shahbaz Khan, Ling Shao, Ming-Hsuan Yang

These contexts can be crucial for several image enhancement tasks, e.g., local and global contrast, brightness and color corrections, which require cues from both local and global spatial extents.

Low-Light Image Enhancement

Learning to Fuse Asymmetric Feature Maps in Siamese Trackers

1 code implementation CVPR 2021 Wencheng Han, Xingping Dong, Fahad Shahbaz Khan, Ling Shao, Jianbing Shen

We propose a learnable module, called the asymmetric convolution (ACM), which learns to better capture the semantic correlation information in offline training on large-scale data.

Visual Object Tracking Visual Tracking

Anomaly Detection in Video via Self-Supervised and Multi-Task Learning

1 code implementation CVPR 2021 Mariana-Iuliana Georgescu, Antonio Barbalau, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah

To the best of our knowledge, we are the first to approach anomalous event detection in video as a multi-task learning problem, integrating multiple self-supervised and knowledge distillation proxy tasks in a single architecture.

Abnormal Event Detection In Video Anomaly Detection In Surveillance Videos +4

Meta-learning the Learning Trends Shared Across Tasks

no code implementations 19 Oct 2020 Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

This demonstrates their ability to acquire transferable knowledge, a capability that is central to human learning.

Meta-Learning

Synthesizing the Unseen for Zero-shot Object Detection

1 code implementation 19 Oct 2020 Nasir Hayat, Munawar Hayat, Shafin Rahman, Salman Khan, Syed Waqas Zamir, Fahad Shahbaz Khan

The existing zero-shot detection approaches project visual features to the semantic domain for seen objects, hoping to map unseen objects to their corresponding semantics during inference.

Generalized Zero-Shot Object Detection Zero-Shot Object Detection

From Handcrafted to Deep Features for Pedestrian Detection: A Survey

2 code implementations 1 Oct 2020 Jiale Cao, Yanwei Pang, Jin Xie, Fahad Shahbaz Khan, Ling Shao

In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features for illumination variance.

Pedestrian Detection

A Background-Agnostic Framework with Adversarial Training for Abnormal Event Detection in Video

2 code implementations 27 Aug 2020 Mariana-Iuliana Georgescu, Radu Tudor Ionescu, Fahad Shahbaz Khan, Marius Popescu, Mubarak Shah

Following the standard formulation of abnormal event detection as outlier detection, we propose a background-agnostic framework that learns from training videos containing only normal events.

Abnormal Event Detection In Video Anomaly Detection In Surveillance Videos +2

Image Colorization: A Survey and Dataset

1 code implementation 25 Aug 2020 Saeed Anwar, Muhammad Tahir, Chongyi Li, Ajmal Mian, Fahad Shahbaz Khan, Abdul Wahab Muzaffar

Image colorization is the process of estimating RGB colors for grayscale images or video frames to improve their aesthetic and perceptual quality.

Colorization Image Colorization

Stylized Adversarial Defense

1 code implementation 29 Jul 2020 Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli

In contrast to existing adversarial training methods that only use class-boundary information (e.g., using a cross-entropy loss), we propose to exploit additional information from the feature space to craft stronger adversaries that are in turn used to learn a robust model.

Adversarial Defense

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

1 code implementation ECCV 2020 Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao

In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings, while operating at comparable speed on a Titan Xp.

object-detection Object Detection +3

Self-supervised Knowledge Distillation for Few-shot Learning

1 code implementation 17 Jun 2020 Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

Our experiments show that, even in the first stage, self-supervision can outperform current state-of-the-art methods, with further gains achieved by our second stage distillation process.

Few-Shot Image Classification Knowledge Distillation +1

A Self-supervised Approach for Adversarial Robustness

2 code implementations CVPR 2020 Muzammal Naseer, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Fatih Porikli

Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems, e.g., for classification, segmentation and object detection.

Adversarial Robustness General Classification +3

Learning Human-Object Interaction Detection using Interaction Points

1 code implementation CVPR 2020 Tiancai Wang, Tong Yang, Martin Danelljan, Fahad Shahbaz Khan, Xiangyu Zhang, Jian Sun

Human-object interaction (HOI) detection strives to localize both the human and an object as well as to identify the complex interactions between them.

Human-Object Interaction Detection Keypoint Detection +1

iTAML: An Incremental Task-Agnostic Meta-learning Approach

1 code implementation CVPR 2020 Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah

In this paper, we hypothesize that this problem can be avoided by learning a set of generalized parameters that are specific to neither old nor new tasks.

Incremental Learning Meta-Learning

CycleISP: Real Image Restoration via Improved Data Synthesis

8 code implementations CVPR 2020 Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

This is mainly because the AWGN is not adequate for modeling the real camera noise which is signal-dependent and heavily transformed by the camera imaging pipeline.

Ranked #9 on Image Denoising on DND (using extra training data)

Image Denoising Image Restoration

Any-Shot Object Detection

no code implementations 16 Mar 2020 Shafin Rahman, Salman Khan, Nick Barnes, Fahad Shahbaz Khan

Any-shot detection offers unique challenges compared to conventional novel object detection, such as a high imbalance between unseen, few-shot and seen object classes, susceptibility to forgetting base-training while learning novel classes, and distinguishing novel classes from the background.

object-detection Object Detection

Learning Enriched Features for Real Image Restoration and Enhancement

12 code implementations ECCV 2020 Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing.

Image Denoising Image Enhancement +2

PSC-Net: Learning Part Spatial Co-occurrence for Occluded Pedestrian Detection

no code implementations 25 Jan 2020 Jin Xie, Yanwei Pang, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao

On the heavily occluded (HO) set of the CityPersons test set, our PSC-Net obtains an absolute gain of 4.0% in terms of log-average miss rate over the state-of-the-art with the same backbone and input scale, and without using additional VBB supervision.

Pedestrian Detection

Fine-grained Recognition: Accounting for Subtle Differences between Similar Classes

no code implementations 14 Dec 2019 Guolei Sun, Hisham Cholakkal, Salman Khan, Fahad Shahbaz Khan, Ling Shao

The main requisite for the fine-grained recognition task is to focus on subtle discriminative details that make the subordinate classes different from each other.

Fine-Grained Image Classification

Towards Partial Supervision for Generic Object Counting in Natural Scenes

1 code implementation 13 Dec 2019 Hisham Cholakkal, Guolei Sun, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Luc van Gool

Our RLC framework further reduces the annotation cost arising from large numbers of object categories in a dataset by only using lower-count supervision for a subset of categories and class-labels for the remaining ones.

Image Classification Image-level Supervised Instance Segmentation +2

MineGAN: effective knowledge transfer from GANs to target domains with few images

2 code implementations CVPR 2020 Yaxing Wang, Abel Gonzalez-Garcia, David Berga, Luis Herranz, Fahad Shahbaz Khan, Joost Van de Weijer

We propose a novel knowledge transfer method for generative models based on mining the knowledge that is most beneficial to a specific target domain, either from a single or multiple pretrained GANs.

Transfer Learning

Random Path Selection for Continual Learning

1 code implementation NeurIPS 2019 Jathushan Rajasegaran, Munawar Hayat, Salman H. Khan, Fahad Shahbaz Khan, Ling Shao

In order to maintain an equilibrium between previous and newly acquired knowledge, we propose a simple controller to dynamically balance the model plasticity.

Continual Learning Incremental Learning +1

Mask-Guided Attention Network for Occluded Pedestrian Detection

1 code implementation ICCV 2019 Yanwei Pang, Jin Xie, Muhammad Haris Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao

Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results on the heavily occluded (HO) pedestrian set of the CityPersons test set.

Pedestrian Detection

Multi-Modal Fusion for End-to-End RGB-T Tracking

1 code implementation 30 Aug 2019 Lichao Zhang, Martin Danelljan, Abel Gonzalez-Garcia, Joost Van de Weijer, Fahad Shahbaz Khan

Our tracker is trained in an end-to-end manner, enabling the components to learn how to fuse the information from both modalities.

Image-to-Image Translation Rgb-T Tracking

3C-Net: Category Count and Center Loss for Weakly-Supervised Action Localization

1 code implementation ICCV 2019 Sanath Narayan, Hisham Cholakkal, Fahad Shahbaz Khan, Ling Shao

Our joint formulation has three terms: a classification term to ensure the separability of learned action features, an adapted multi-label center loss term to enhance the action feature discriminability and a counting loss term to delineate adjacent action sequences, leading to improved localization.

Action Classification Weakly Supervised Action Localization +2

Learning the Model Update for Siamese Trackers

1 code implementation ICCV 2019 Lichao Zhang, Abel Gonzalez-Garcia, Joost Van de Weijer, Martin Danelljan, Fahad Shahbaz Khan

In general, this template is linearly combined with the accumulated template from the previous frame, resulting in an exponential decay of information over time.
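
The exponential decay mentioned in this summary follows directly from unrolling the linear update. The sketch below is illustrative only and is not taken from the paper's code; the function names and the update rate `gamma` are assumptions chosen for the example.

```python
def update_template(accumulated, current, gamma=0.1):
    """Linear running-average update of a template (lists of floats).

    new = (1 - gamma) * accumulated + gamma * current
    `gamma` is a hypothetical update rate, not a value from the paper.
    """
    return [(1.0 - gamma) * a + gamma * c for a, c in zip(accumulated, current)]


def contribution_weights(num_updates, gamma=0.1):
    """Weight of each frame's template after `num_updates` linear updates.

    Index 0 is the initial template; index i >= 1 is the template from
    the i-th update. Unrolling the recursion gives
    (1 - gamma)^n for the initial template and
    gamma * (1 - gamma)^(n - i) for update i, so older frames fade
    geometrically (exponential decay).
    """
    n = num_updates
    weights = [(1.0 - gamma) ** n]
    weights += [gamma * (1.0 - gamma) ** (n - i) for i in range(1, n + 1)]
    return weights


if __name__ == "__main__":
    for age, w in enumerate(reversed(contribution_weights(5, gamma=0.1))):
        print(f"frame age {age}: weight {w:.4f}")
```

The weights always sum to one, so the accumulated template stays a convex combination of past templates while the influence of each past frame shrinks by a factor of `1 - gamma` per update.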

Visual Tracking

Distilled Siamese Networks for Visual Tracking

no code implementations 24 Jul 2019 Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, Steven Hoi

This model is intuitively inspired by the one teacher vs. multiple students learning method typically employed in schools.

Knowledge Distillation Object Tracking +1

An Adaptive Random Path Selection Approach for Incremental Learning

1 code implementation 3 Jun 2019 Jathushan Rajasegaran, Munawar Hayat, Salman Khan, Fahad Shahbaz Khan, Ling Shao, Ming-Hsuan Yang

In a conventional supervised learning setting, a machine learning model has access to examples of all object classes that are desired to be recognized during the inference stage.

Ranked #7 on Incremental Learning on ImageNet100 - 10 steps (Average Incremental Accuracy Top-5 metric)

Incremental Learning Knowledge Distillation +1

iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images

3 code implementations 30 May 2019 Syed Waqas Zamir, Aditya Arora, Akshita Gupta, Salman Khan, Guolei Sun, Fahad Shahbaz Khan, Fan Zhu, Ling Shao, Gui-Song Xia, Xiang Bai

Compared to existing small-scale aerial image based instance segmentation datasets, iSAID contains 15× the number of object categories and 5× the number of instances.

Instance Segmentation object-detection +2

Cross-Domain Transferability of Adversarial Perturbations

1 code implementation NeurIPS 2019 Muzammal Naseer, Salman H. Khan, Harris Khan, Fahad Shahbaz Khan, Fatih Porikli

To this end, we propose a framework capable of launching highly transferable attacks that crafts adversarial patterns to mislead networks trained on wholly different domains.

Discriminative Online Learning for Fast Video Object Segmentation

no code implementations 18 Apr 2019 Andreas Robinson, Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

We propose a novel approach, based on a dedicated target appearance model that is exclusively learned online to discriminate between the target and background image regions.

One-shot visual object segmentation Semantic Segmentation +2

Learning Digital Camera Pipeline for Extreme Low-Light Imaging

no code implementations 11 Apr 2019 Syed Waqas Zamir, Aditya Arora, Salman Khan, Fahad Shahbaz Khan, Ling Shao

In low-light conditions, a conventional camera imaging pipeline produces sub-optimal images that are usually dark and noisy due to a low photon count and low signal-to-noise ratio (SNR).

Object Counting and Instance Segmentation with Image-level Supervision

2 code implementations CVPR 2019 Hisham Cholakkal, Guolei Sun, Fahad Shahbaz Khan, Ling Shao

Moreover, our approach improves state-of-the-art image-level supervised instance segmentation with a relative gain of 17.8% in terms of average best overlap, on the PASCAL VOC 2012 dataset.

Image-level Supervised Instance Segmentation Object Counting +1

ATOM: Accurate Tracking by Overlap Maximization

3 code implementations CVPR 2019 Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg

We argue that this approach is fundamentally limited since target estimation is a complex task, requiring high-level knowledge about the object.

General Classification Visual Object Tracking +1

Confidence Propagation through CNNs for Guided Sparse Depth Regression

1 code implementation 5 Nov 2018 Abdelrahman Eldesokey, Michael Felsberg, Fahad Shahbaz Khan

In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work.

Autonomous Driving Depth Completion +1

Synthetic data generation for end-to-end thermal infrared tracking

no code implementations 4 Jun 2018 Lichao Zhang, Abel Gonzalez-Garcia, Joost Van de Weijer, Martin Danelljan, Fahad Shahbaz Khan

These methods provide us with a large labeled dataset of synthetic TIR sequences, on which we can train end-to-end optimal features for tracking.

Image-to-Image Translation Synthetic Data Generation +2

Propagating Confidences through CNNs for Sparse Data Regression

1 code implementation 30 May 2018 Abdelrahman Eldesokey, Michael Felsberg, Fahad Shahbaz Khan

To tackle this challenging problem, we introduce an algebraically-constrained convolution layer for CNNs with sparse input and demonstrate its capabilities for the scene depth completion task.

Autonomous Driving Depth Completion +1

Density Adaptive Point Set Registration

1 code implementation CVPR 2018 Felix Järemo Lawin, Martin Danelljan, Fahad Shahbaz Khan, Per-Erik Forssén, Michael Felsberg

Contrary to previous works, we model the underlying structure of the scene as a latent probability distribution, and thereby induce invariance to point set density changes.

Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification

no code implementations 5 Jun 2017 Rao Muhammad Anwer, Fahad Shahbaz Khan, Joost Van de Weijer, Matthieu Molinier, Jorma Laaksonen

To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification.

Aerial Scene Classification General Classification +2

Deep Projective 3D Semantic Segmentation

1 code implementation 9 May 2017 Felix Järemo Lawin, Martin Danelljan, Patrik Tosteberg, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg

Recent attempts, based on 3D deep learning approaches (3D-CNNs), have achieved below-expected results.

Deep Motion Features for Visual Tracking

no code implementations 20 Dec 2016 Susanna Gladh, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg

To the best of our knowledge, we are the first to propose fusing appearance information with deep motion features for visual tracking.

Action Recognition Optical Flow Estimation +1

Scale Coding Bag of Deep Features for Human Attribute and Action Recognition

no code implementations 14 Dec 2016 Fahad Shahbaz Khan, Joost Van de Weijer, Rao Muhammad Anwer, Andrew D. Bagdanov, Michael Felsberg, Jorma Laaksonen

Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding.

Action Recognition In Still Images

ECO: Efficient Convolution Operators for Tracking

2 code implementations CVPR 2017 Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg

Moreover, our fast variant, using hand-crafted features, operates at 60 Hz on a single CPU, while obtaining 65.0% AUC on OTB-2015.

Visual Object Tracking

Discriminative Scale Space Tracking

no code implementations 20 Sep 2016 Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg

Compared to the standard exhaustive scale search, our approach achieves a gain of 2.5% in average overlap precision on the OTB dataset.

Visual Object Tracking

Learning Spatially Regularized Correlation Filters for Visual Tracking

no code implementations ICCV 2015 Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan, Michael Felsberg

These methods utilize a periodic assumption of the training samples to efficiently learn a classifier on all patches in the target neighborhood.

Visual Tracking

A Probabilistic Framework for Color-Based Point Set Registration

no code implementations CVPR 2016 Martin Danelljan, Giulia Meneghetti, Fahad Shahbaz Khan, Michael Felsberg

On the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline.
