Search Results for author: Muhammad Ferjad Naeem

Found 17 papers, 8 papers with code

GiT: Towards Generalist Vision Transformer through Universal Language Interface

2 code implementations • 14 Mar 2024 • Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, LiWei Wang

Due to its simple design, this paradigm holds promise for narrowing the architectural gap between vision and language.

Language Modelling

205

Paper
Code

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

no code implementations • 11 Mar 2024 • Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc van Gool, Didier Stricker, Muhammad Zeshan Afzal

We propose FocusCLIP, integrating subject-level guidance--a specialized mechanism for target-specific supervision--into the CLIP framework for improved zero-shot transfer on human-centric tasks.

Ranked #1 on Emotion Recognition on EMOTIC

Activity Recognition Age Classification +1

Paper
Add Code

Learning to Prompt with Text Only Supervision for Vision-Language Models

1 code implementation • 4 Jan 2024 • Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Muzammal Naseer, Luc van Gool, Federico Tombari

While effective, most of these works require labeled data which is not practical, and often struggle to generalize towards new datasets due to over-fitting on the source data.

Prompt Engineering

Paper
Code

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

1 code implementation • 27 Nov 2023 • Lukas Hoyer, David Joseph Tan, Muhammad Ferjad Naeem, Luc van Gool, Federico Tombari

In SemiVL, we propose to integrate rich priors from VLM pre-training into semi-supervised semantic segmentation to learn better semantic decision boundaries.

Ranked #1 on Semi-Supervised Semantic Segmentation on PASCAL VOC 2012 732 labeled (using extra training data)

Segmentation Semi-Supervised Semantic Segmentation

Paper
Code

SILC: Improving Vision Language Pretraining with Self-Distillation

no code implementations • 20 Oct 2023 • Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai, Lukas Hoyer, Luc van Gool, Federico Tombari

However, the contrastive objective used by these models only focuses on image-text alignment and does not incentivise image feature learning for dense prediction tasks.

Ranked #1 on Open Vocabulary Semantic Segmentation on PascalVOC-20b

Classification Contrastive Learning +8

Paper
Add Code

Introducing Language Guidance in Prompt-based Continual Learning

1 code implementation • ICCV 2023 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

While the model faces a disjoint set of classes in each task in this setting, we argue that these classes can be encoded to the same embedding space of a pre-trained language encoder.

Continual Learning

Paper
Code

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

no code implementations • CVPR 2023 • Muhammad Ferjad Naeem, Muhammad Gul Zain Ali Khan, Yongqin Xian, Muhammad Zeshan Afzal, Didier Stricker, Luc van Gool, Federico Tombari

Our proposed model, I2MVFormer, learns multi-view semantic embeddings for zero-shot image classification with these class views.

Classification Image Classification +3

Paper
Add Code

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations • 20 Oct 2022 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

Paper
Add Code

I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification

no code implementations • 21 Sep 2022 • Muhammad Ferjad Naeem, Yongqin Xian, Luc van Gool, Federico Tombari

In order to distill discriminative visual words from noisy documents, we introduce a new cross-modal attention module that learns fine-grained interactions between image patches and document words.

Generalized Zero-Shot Learning Image Classification +2

Paper
Add Code

3D Compositional Zero-shot Learning with DeCompositional Consensus

no code implementations • 29 Nov 2021 • Muhammad Ferjad Naeem, Evin Pınar Örnek, Yongqin Xian, Luc van Gool, Federico Tombari

Parts represent a basic unit of geometric and semantic similarity across different objects.

Benchmarking Compositional Zero-Shot Learning +6

Paper
Add Code

Learning Graph Embeddings for Open World Compositional Zero-Shot Learning

2 code implementations • 3 May 2021 • Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata

In this work, we overcome this assumption operating on the open world setting, where no limit is imposed on the compositional space at test time, and the search space contains a large number of unseen compositions.

Compositional Zero-Shot Learning

107

Paper
Code

Learning Graph Embeddings for Compositional Zero-shot Learning

1 code implementation • CVPR 2021 • Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata

In compositional zero-shot learning, the goal is to recognize unseen compositions (e. g. old dog) of observed visual primitives states (e. g. old, cute) and objects (e. g. car, dog) in the training set.

Compositional Zero-Shot Learning Graph Embedding +1

107

Paper
Code

Open World Compositional Zero-Shot Learning

2 code implementations • CVPR 2021 • Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata

After estimating the feasibility score of each composition, we use these scores to either directly mask the output space or as a margin for the cosine similarity between visual features and compositional embeddings during training.

Compositional Zero-Shot Learning

107

Paper
Code

Reliable Fidelity and Diversity Metrics for Generative Models

3 code implementations • ICML 2020 • Muhammad Ferjad Naeem, Seong Joon Oh, Youngjung Uh, Yunjey Choi, Jaejun Yoo

In this paper, we show that even the latest version of the precision and recall metrics are not reliable yet.

Image Generation

234

Paper
Code

Deep Learning Under the Microscope: Improving the Interpretability of Medical Imaging Neural Networks

no code implementations • 5 Apr 2019 • Magdalini Paschali, Muhammad Ferjad Naeem, Walter Simson, Katja Steiger, Martin Mollenhauer, Nassir Navab

In this paper, we propose a novel interpretation method tailored to histological Whole Slide Image (WSI) processing.

Decision Making General Classification +1

Paper
Add Code

Data Augmentation with Manifold Exploring Geometric Transformations for Increased Performance and Robustness

no code implementations • 14 Jan 2019 • Magdalini Paschali, Walter Simson, Abhijit Guha Roy, Muhammad Ferjad Naeem, Rüdiger Göbl, Christian Wachinger, Nassir Navab

Compared with traditional augmentation methods, and with images synthesized by Generative Adversarial Networks our method not only achieves state-of-the-art performance but also significantly improves the network's robustness.

Data Augmentation General Classification +2

Paper
Add Code

A Multi-faceted OCR Framework for Artificial Urdu News Ticker Text Recognition

no code implementations • 2018 13th IAPR International Workshop on Document Analysis Systems (DAS) 2018 • Sami-Ur-Rehman, Burhan Ul Tayyab, Muhammad Ferjad Naeem, Adnan Ul-Hasan, Faisal Shafait

We present the first comprehensive data set, to our knowledge, for Urdu news ticker recognition, collected from 41 different news channels.

Optical Character Recognition (OCR) Retrieval

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.