Search Results for author: Muhammad Zeshan Afzal

Found 19 papers, 6 papers with code

Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection

no code implementations • 2 Apr 2024 • Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, particularly focusing on the challenges posed by the quality of object queries.

Object object-detection +4

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

no code implementations • 11 Mar 2024 • Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc van Gool, Didier Stricker, Muhammad Zeshan Afzal

We propose FocusCLIP, which integrates subject-level guidance (a specialized mechanism for target-specific supervision) into the CLIP framework for improved zero-shot transfer on human-centric tasks.

Ranked #1 on Emotion Recognition on EMOTIC

Activity Recognition Age Classification +1

Introducing Language Guidance in Prompt-based Continual Learning

1 code implementation • ICCV 2023 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

While the model faces a disjoint set of classes in each task in this setting, we argue that these classes can be encoded to the same embedding space of a pre-trained language encoder.

Continual Learning

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images

no code implementations • 23 Jun 2023 • Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal

Upon integrating query modifications into DETR, we outperform prior works and achieve new state-of-the-art results with mAP of 96.9%, 95.7%, and 99.3% on TableBank, PubLayNet, and PubTables, respectively.

Ranked #3 on Document Layout Analysis on PubLayNet val

Document Layout Analysis Object +2

Object Detection with Transformers: A Review

2 code implementations • 7 Jun 2023 • Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

The astounding performance of transformers in natural language processing (NLP) has motivated researchers to explore their applications in computer vision tasks.

Object object-detection +1

Towards End-to-End Semi-Supervised Table Detection with Deformable Transformer

no code implementations • 4 May 2023 • Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal

Table detection is the task of classifying and localizing table objects within document images.

Table Detection

Robust and Fast Vehicle Detection using Augmented Confidence Map

no code implementations • 27 Apr 2023 • Hamam Mokayed, Palaiahnakote Shivakumara, Lama Alkhaled, Rajkumar Saini, Muhammad Zeshan Afzal, Yan Chai Hum, Marcus Liwicki

Vehicle detection in real-time scenarios is challenging because of the time constraints and the presence of multiple types of vehicles with different speeds, shapes, structures, etc.

Fast Vehicle Detection

I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification

no code implementations • CVPR 2023 • Muhammad Ferjad Naeem, Muhammad Gul Zain Ali Khan, Yongqin Xian, Muhammad Zeshan Afzal, Didier Stricker, Luc van Gool, Federico Tombari

Our proposed model, I2MVFormer, learns multi-view semantic embeddings for zero-shot image classification with these class views.

Classification Image Classification +3

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations • 20 Oct 2022 • Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

SemAttNet: Towards Attention-based Semantic Aware Guided Depth Completion

1 code implementation • 28 Apr 2022 • Danish Nazir, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal

Depth completion involves recovering a dense depth map from a sparse map and an RGB image.

Ranked #1 on Depth Completion on KITTI Depth Completion

Depth Completion

Current Status and Performance Analysis of Table Recognition in Document Images with Deep Neural Networks

no code implementations • 29 Apr 2021 • Khurram Azeem Hashmi, Marcus Liwicki, Didier Stricker, Muhammad Adnan Afzal, Muhammad Ahtsham Afzal, Muhammad Zeshan Afzal

Table understanding has substantially benefited from the recent breakthroughs in deep neural networks.

Table Detection Table Recognition

Guided Table Structure Recognition through Anchor Optimization

no code implementations • 21 Apr 2021 • Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Noman Afzal, Muhammad Zeshan Afzal

Subsequently, these anchors are exploited to locate the rows and columns in tabular images.

Ranked #1 on Table Recognition on ICDAR2013 table structure recognition

Instance Segmentation object-detection +2

Recognizing Challenging Handwritten Annotations with Fully Convolutional Networks

no code implementations • 1 Apr 2018 • Andreas Kölsch, Ashutosh Mishra, Saurabh Varshneya, Muhammad Zeshan Afzal, Marcus Liwicki

This paper introduces a very challenging dataset of historic German documents and evaluates Fully Convolutional Neural Network (FCNN) based methods to locate handwritten annotations of any kind in these documents.

Data Augmentation Semantic Segmentation

Real-Time Document Image Classification using Deep CNN and Extreme Learning Machines

no code implementations • 3 Nov 2017 • Andreas Kölsch, Muhammad Zeshan Afzal, Markus Ebbecke, Marcus Liwicki

This paper presents an approach for real-time training and testing for document image classification.

Classification Document Classification +2

Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification

5 code implementations • 11 Apr 2017 • Muhammad Zeshan Afzal, Andreas Kölsch, Sheraz Ahmed, Marcus Liwicki

We present an exhaustive investigation of recent Deep Learning architectures, algorithms, and strategies for the task of document image classification to finally reduce the error by more than half.

Ranked #27 on Document Image Classification on RVL-CDIP

Document Image Classification General Classification +2

TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network

4 code implementations • 19 Mar 2017 • Ayushman Dash, John Cristian Borges Gamboa, Sheraz Ahmed, Marcus Liwicki, Muhammad Zeshan Afzal

In this work, we present the Text Conditioned Auxiliary Classifier Generative Adversarial Network (TAC-GAN), a text-to-image Generative Adversarial Network (GAN) for synthesizing images from their text descriptions.

Generative Adversarial Network MS-SSIM +1

Multilevel Context Representation for Improving Object Recognition

no code implementations • 19 Mar 2017 • Andreas Kölsch, Muhammad Zeshan Afzal, Marcus Liwicki

In this work, we propose the combined usage of low- and high-level blocks of convolutional neural networks (CNNs) for improving object recognition.

Data Augmentation Object +2

A Generic Method for Automatic Ground Truth Generation of Camera-captured Documents

no code implementations • 4 May 2016 • Sheraz Ahmed, Muhammad Imran Malik, Muhammad Zeshan Afzal, Koichi Kise, Masakazu Iwamura, Andreas Dengel, Marcus Liwicki

The method is generic and language-independent, and can be used to generate labeled document datasets (both scanned and camera-captured) in any cursive or non-cursive language, e.g., English, Russian, Arabic, or Urdu.

Optical Character Recognition (OCR)

DeXpression: Deep Convolutional Neural Network for Expression Recognition

3 code implementations • 17 Sep 2015 • Peter Burkert, Felix Trier, Muhammad Zeshan Afzal, Andreas Dengel, Marcus Liwicki

The proposed architecture achieves 99.6% for CKP and 98.63% for MMI, therefore performing better than the state of the art using CNNs.

Ranked #1 on Facial Expression Recognition (FER) on MMI

Emotion Recognition Facial Expression Recognition +1
