Search Results for author: Muhammad Zeshan Afzal

Found 29 papers, 9 papers with code

Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer

no code implementations · 11 Jul 2024 · Tahira Shehzadi, Ifza, Didier Stricker, Muhammad Zeshan Afzal

The impressive advancements in semi-supervised learning have driven researchers to explore its potential in object detection tasks within the field of computer vision.

Data Augmentation Object +3

Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation

1 code implementation · 22 Jun 2024 · Muhammad Saif Ullah Khan, Muhammad Zeshan Afzal, Didier Stricker

Reconstructing texture-less surfaces poses unique challenges in computer vision, primarily due to the lack of specialized datasets that cater to the nuanced needs of depth and normals estimation in the absence of textural information.

Decoder Object Reconstruction

Situational Instructions Database: Task Guidance in Dynamic Environments

1 code implementation · 19 Jun 2024 · Muhammad Saif Ullah Khan, Sankalp Sinha, Didier Stricker, Muhammad Zeshan Afzal

The Situational Instructions Database (SID) addresses the need for enhanced situational awareness in artificial intelligence (AI) systems operating in dynamic environments.

Decision Making

UnSupDLA: Towards Unsupervised Document Layout Analysis

no code implementations · 10 Jun 2024 · Talha Uddin Sheikh, Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

Moreover, the diversity of documents online presents a unique set of challenges in maintaining the quality and consistency of these labels, further complicating document layout analysis in the digital era.

Diversity Document Layout Analysis +2

End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents

no code implementations · 8 May 2024 · Iqraa Ehsan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Table detection, a pivotal task in document analysis, aims to precisely recognize and locate tables within document images.

Table Detection

CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification

no code implementations · 6 May 2024 · Sankalp Sinha, Muhammad Saif Ullah Khan, Talha Uddin Sheikh, Didier Stricker, Muhammad Zeshan Afzal

We provide a comprehensive document image classification analysis in Zero-Shot Learning (ZSL) and Generalized Zero-Shot Learning (GZSL) settings to address this gap.

Document Classification Document Image Classification +1

Towards End-to-End Semi-Supervised Table Detection with Semantic Aligned Matching Transformer

no code implementations · 30 Apr 2024 · Tahira Shehzadi, Shalini Sarode, Didier Stricker, Muhammad Zeshan Afzal

However, recent advancements in the field have shifted the focus towards transformer-based techniques, eliminating the need for NMS and emphasizing object queries and attention mechanisms.

Object Table Detection

A Hybrid Approach for Document Layout Analysis in Document images

no code implementations · 27 Apr 2024 · Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

This paper navigates the complexities of understanding various elements within document images, such as text, images, tables, and headings.

Contrastive Learning Decoder +5

Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection

no code implementations · CVPR 2024 · Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, particularly focusing on the challenges posed by the quality of object queries.

Object object-detection +4

FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks

no code implementations · 11 Mar 2024 · Muhammad Saif Ullah Khan, Muhammad Ferjad Naeem, Federico Tombari, Luc van Gool, Didier Stricker, Muhammad Zeshan Afzal

We propose FocusCLIP, integrating subject-level guidance--a specialized mechanism for target-specific supervision--into the CLIP framework for improved zero-shot transfer on human-centric tasks.

Activity Recognition Age Classification +1

Introducing Language Guidance in Prompt-based Continual Learning

1 code implementation · ICCV 2023 · Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Didier Stricker, Federico Tombari, Muhammad Zeshan Afzal

While the model faces a disjoint set of classes in each task in this setting, we argue that these classes can be encoded to the same embedding space of a pre-trained language encoder.

Continual Learning

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images

no code implementations · 23 Jun 2023 · Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal

Upon integrating query modifications into DETR, we outperform prior work and achieve new state-of-the-art results, with mAP of 96.9%, 95.7%, and 99.3% on TableBank, PubLayNet, and PubTables, respectively.

Document Layout Analysis Object +2

Object Detection with Transformers: A Review

2 code implementations · 7 Jun 2023 · Tahira Shehzadi, Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal

The astounding performance of transformers in natural language processing (NLP) has motivated researchers to explore their applications in computer vision tasks.

Object object-detection +1

Robust and Fast Vehicle Detection using Augmented Confidence Map

no code implementations · 27 Apr 2023 · Hamam Mokayed, Palaiahnakote Shivakumara, Lama Alkhaled, Rajkumar Saini, Muhammad Zeshan Afzal, Yan Chai Hum, Marcus Liwicki

Vehicle detection in real-time scenarios is challenging because of the time constraints and the presence of multiple types of vehicles with different speeds, shapes, structures, etc.

Fast Vehicle Detection

Learning Attention Propagation for Compositional Zero-Shot Learning

no code implementations · 20 Oct 2022 · Muhammad Gul Zain Ali Khan, Muhammad Ferjad Naeem, Luc van Gool, Alain Pagani, Didier Stricker, Muhammad Zeshan Afzal

CAPE learns to identify this structure and propagates knowledge between them to learn class embedding for all seen and unseen compositions.

Compositional Zero-Shot Learning

Recognizing Challenging Handwritten Annotations with Fully Convolutional Networks

no code implementations · 1 Apr 2018 · Andreas Kölsch, Ashutosh Mishra, Saurabh Varshneya, Muhammad Zeshan Afzal, Marcus Liwicki

This paper introduces a very challenging dataset of historic German documents and evaluates Fully Convolutional Neural Network (FCNN) based methods to locate handwritten annotations of any kind in these documents.

Data Augmentation Semantic Segmentation

Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification

5 code implementations · 11 Apr 2017 · Muhammad Zeshan Afzal, Andreas Kölsch, Sheraz Ahmed, Marcus Liwicki

We present an exhaustive investigation of recent Deep Learning architectures, algorithms, and strategies for the task of document image classification to finally reduce the error by more than half.

Document Image Classification General Classification +2

TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network

4 code implementations · 19 Mar 2017 · Ayushman Dash, John Cristian Borges Gamboa, Sheraz Ahmed, Marcus Liwicki, Muhammad Zeshan Afzal

In this work, we present the Text Conditioned Auxiliary Classifier Generative Adversarial Network (TAC-GAN), a text-to-image Generative Adversarial Network (GAN) for synthesizing images from their text descriptions.

Diversity Generative Adversarial Network +2

Multilevel Context Representation for Improving Object Recognition

no code implementations · 19 Mar 2017 · Andreas Kölsch, Muhammad Zeshan Afzal, Marcus Liwicki

In this work, we propose the combined usage of low- and high-level blocks of convolutional neural networks (CNNs) for improving object recognition.

Data Augmentation Object +2

A Generic Method for Automatic Ground Truth Generation of Camera-captured Documents

no code implementations · 4 May 2016 · Sheraz Ahmed, Muhammad Imran Malik, Muhammad Zeshan Afzal, Koichi Kise, Masakazu Iwamura, Andreas Dengel, Marcus Liwicki

The method is generic and language independent, and can be used to generate labeled document datasets (both scanned and camera-captured) in any cursive or non-cursive language, e.g., English, Russian, Arabic, Urdu, etc.

Optical Character Recognition (OCR)
