Search Results for author: Muhammad Awais

Found 42 papers, 20 papers with code

Rethinking Positive Pairs in Contrastive Learning

no code implementations 23 Oct 2024 Jiantao Wu, Shentong Mo, ZhenHua Feng, Sara Atito, Josef Kittler, Muhammad Awais

We challenge this assumption by proposing to learn from arbitrary pairs, allowing any pair of samples to be positive within our framework. The primary challenge of the proposed approach lies in applying contrastive learning to disparate pairs which are semantically distant.

Contrastive Learning Representation Learning

AgroGPT: Efficient Agricultural Vision-Language Model with Expert Tuning

1 code implementation 10 Oct 2024 Muhammad Awais, Ali Husain Salem Abdulla Alharthi, Amandeep Kumar, Hisham Cholakkal, Rao Muhammad Anwer

In this work, we propose an approach to construct instruction-tuning data that harnesses vision-only data for the agriculture domain.

Language Modelling

AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment

1 code implementation 2 Oct 2024 Umair Nawaz, Muhammad Awais, Hanan Gani, Muzammal Naseer, Fahad Khan, Salman Khan, Rao Muhammad Anwer

Further, this domain desires fine-grained feature learning due to the subtle nature of the downstream tasks (e.g., nutrient deficiency detection, livestock breed classification).

Self-Supervised Learning Zero-Shot Learning

Probabilistically Aligned View-unaligned Clustering with Adaptive Template Selection

no code implementations 23 Sep 2024 Wenhua Dong, Xiao-Jun Wu, ZhenHua Feng, Sara Atito, Muhammad Awais, Josef Kittler

In most existing multi-view modeling scenarios, cross-view correspondence (CVC) between instances of the same target from different views, like paired image-text data, is a crucial prerequisite for effortlessly deriving a consistent representation.

Clustering

BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning

1 code implementation 14 Aug 2024 Asif Hanif, Fahad Shamshad, Muhammad Awais, Muzammal Naseer, Fahad Shahbaz Khan, Karthik Nandakumar, Salman Khan, Rao Muhammad Anwer

Inspired by the latest developments in learnable prompts, this work introduces a method to embed a backdoor into the medical foundation model during the prompt learning phase.

Backdoor Attack

C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition

1 code implementation 8 Jul 2024 Rongchang Li, ZhenHua Feng, Tianyang Xu, Linze Li, Xiao-Jun Wu, Muhammad Awais, Sara Atito, Josef Kittler

For evaluating the task, we construct a new benchmark, Something-composition (Sth-com), based on the widely used Something-Something V2 dataset.

Action Recognition

BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases

1 code implementation 27 Jun 2024 Muhammad Awais, Mehaboobathunnisa Sahul Hameed, Bidisha Bhattacharya, Orly Reiner, Rao Muhammad Anwer

Quantifying cellular processes like mitosis in these organoids offers insights into neurodevelopmental disorders, but the manual analysis is time-consuming, and existing datasets lack specific details for brain organoid studies.

Object Detection

Pseudo Labelling for Enhanced Masked Autoencoders

no code implementations 25 Jun 2024 Srinivasa Rao Nandam, Sara Atito, ZhenHua Feng, Josef Kittler, Muhammad Awais

The targets for pseudo labelling and reconstruction need to be generated by a teacher network.

Semantic Segmentation

Investigating Self-Supervised Methods for Label-Efficient Learning

no code implementations 25 Jun 2024 Srinivasa Rao Nandam, Sara Atito, ZhenHua Feng, Josef Kittler, Muhammad Awais

Vision transformers combined with self-supervised learning have enabled the development of models which scale across large datasets for several downstream tasks like classification, segmentation and detection.

Classification Clustering +6

Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning

1 code implementation 6 Jun 2024 Amandeep Kumar, Muhammad Awais, Sanath Narayan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer

The LAE harnesses a pre-trained vision-language model to find text-guided attribute-specific editing direction in the latent space of any pre-trained 3D-aware GAN.

Attribute Language Modelling

MAX-AST: COMBINING CONVOLUTION, LOCAL AND GLOBAL SELF-ATTENTIONS FOR AUDIO EVENT CLASSIFICATION

1 code implementation ICASSP 2024 Tony Alex, Sara Ahmed, Armin Mustafa, Muhammad Awais, Philip JB Jackson

In the domain of audio transformer architectures, prior research has extensively investigated isotropic architectures that capture the global context through full self-attention and hierarchical architectures that progressively transition from local to global context utilising hierarchical structures with convolutions or window-based attention.

Audio Classification

DailyMAE: Towards Pretraining Masked Autoencoders in One Day

1 code implementation 31 Mar 2024 Jiantao Wu, Shentong Mo, Sara Atito, ZhenHua Feng, Josef Kittler, Muhammad Awais

Recently, masked image modeling (MIM), an important self-supervised learning (SSL) method, has drawn attention for its effectiveness in learning data representation from unlabeled data.

Self-Supervised Learning

DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification

1 code implementation AAAI 2024 Tony Alex, Sara Ahmed, Armin Mustafa, Muhammad Awais, Philip JB Jackson

Convolutional neural networks (CNNs) and Transformer-based networks have recently enjoyed significant attention for various audio classification and tagging tasks following their wide adoption in the computer vision domain.

Audio Classification Information Retrieval

DiCoM -- Diverse Concept Modeling towards Enhancing Generalizability in Chest X-Ray Studies

no code implementations 22 Feb 2024 Abhijeet Parida, Daniel Capellan-Martin, Sara Atito, Muhammad Awais, Maria J. Ledesma-Carbayo, Marius G. Linguraru, Syed Muhammad Anwar

In this context, we introduce Diverse Concept Modeling (DiCoM), a novel self-supervised training paradigm that leverages a student teacher framework for learning diverse concepts and hence effective representation of the CXR data.

LT-ViT: A Vision Transformer for multi-label Chest X-ray classification

no code implementations 13 Nov 2023 Umar Marikkar, Sara Atito, Muhammad Awais, Adam Mahdi

Vision Transformers (ViTs) are widely adopted in medical imaging tasks, and some existing efforts have been directed towards vision-language training for Chest X-rays (CXRs).

Masked Momentum Contrastive Learning for Zero-shot Semantic Understanding

no code implementations 22 Aug 2023 Jiantao Wu, Shentong Mo, Muhammad Awais, Sara Atito, ZhenHua Feng, Josef Kittler

Self-supervised pretraining (SSP) has emerged as a popular technique in machine learning, enabling the extraction of meaningful feature representations without labelled data.

Contrastive Learning Object +6

Foundational Models Defining a New Era in Vision: A Survey and Outlook

1 code implementation 25 Jul 2023 Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Fahad Shahbaz Khan

Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world.

Benchmarking

Variantional autoencoder with decremental information bottleneck for disentanglement

1 code implementation 22 Mar 2023 Jiantao Wu, Shentong Mo, Muhammad Awais, Sara Atito, Xingshen Zhang, Lin Wang, Xiang Yang

One major challenge of disentanglement learning with variational autoencoders is the trade-off between disentanglement and reconstruction fidelity.

Disentanglement

SPCXR: Self-supervised Pretraining using Chest X-rays Towards a Domain Specific Foundation Model

no code implementations 23 Nov 2022 Syed Muhammad Anwar, Abhijeet Parida, Sara Atito, Muhammad Awais, Gustavo Nino, Josef Kittler, Marius George Linguraru

However, the traditional diagnostic tool design methods based on supervised learning are burdened by the need to provide training data annotation, which should be of good quality for better clinical outcomes.

COVID-19 Diagnosis Image Segmentation +3

ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification

1 code implementation 23 Nov 2022 Sara Atito, Muhammad Awais, Wenwu Wang, Mark D Plumbley, Josef Kittler

Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.

Keyword Spotting Self-Supervised Learning +1

SB-SSL: Slice-Based Self-Supervised Transformers for Knee Abnormality Classification from MRI

no code implementations 29 Aug 2022 Sara Atito, Syed Muhammad Anwar, Muhammad Awais, Josef Kittler

The availability of large-scale data with high-quality ground truth labels is a challenge when developing supervised machine learning solutions for the healthcare domain.

Self-Supervised Learning

GMML is All you Need

1 code implementation 30 May 2022 Sara Atito, Muhammad Awais, Josef Kittler

This has motivated the research in self-supervised transformer pretraining, which does not need to decode the semantic information conveyed by labels to link it to the image properties, but rather focuses directly on extracting a concise representation of the image data that reflects the notion of similarity, and is invariant to nuisance factors.

Data Augmentation Self-Learning +1

MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning

no code implementations 30 Nov 2021 Sara Atito, Muhammad Awais, Ammarah Farooq, ZhenHua Feng, Josef Kittler

In this aspect the proposed SSL framework MC-SSL0.0 is a step towards Multi-Concept Self-Supervised Learning (MC-SSL) that goes beyond modelling single dominant label in an image to effectively utilise the information from all the concepts present in it.

Image Classification Self-Supervised Learning +1

Global Interaction Modelling in Vision Transformer via Super Tokens

no code implementations 25 Nov 2021 Ammarah Farooq, Muhammad Awais, Sara Ahmed, Josef Kittler

Hence, most of the learning is independent of the image patches $(N)$ in the higher layers, and the class embedding is learned solely based on the Super tokens $(N/M^2)$ where $M^2$ is the window size.

Image Classification Representation Learning

MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps

no code implementations NeurIPS 2021 Muhammad Awais, Fengwei Zhou, Chuanlong Xie, Jiawei Li, Sung-Ho Bae, Zhenguo Li

First, we theoretically show the transferability of robustness from an adversarially trained teacher model to a student model with the help of mixup augmentation.

Transfer Learning

Adversarial Robustness for Unsupervised Domain Adaptation

no code implementations ICCV 2021 Muhammad Awais, Fengwei Zhou, Hang Xu, Lanqing Hong, Ping Luo, Sung-Ho Bae, Zhenguo Li

Extensive Unsupervised Domain Adaptation (UDA) studies have shown great success in practice by learning transferable representations across a labeled source domain and an unlabeled target domain with deep models.

Adversarial Robustness Unsupervised Domain Adaptation

SiT: Self-supervised vIsion Transformer

2 code implementations 8 Apr 2021 Sara Atito, Muhammad Awais, Josef Kittler

We also observed that SiT is good for few shot learning and also showed that it is learning useful representation by simply training a linear classifier on top of the learned features from SiT.

Few-Shot Learning Self-Supervised Learning

NPT-Loss: A Metric Loss with Implicit Mining for Face Recognition

no code implementations 5 Mar 2021 Syed Safwan Khalid, Muhammad Awais, Chi-Ho Chan, ZhenHua Feng, Ammarah Farooq, Ali Akbari, Josef Kittler

One key ingredient of DCNN-based FR is the appropriate design of a loss function that ensures discrimination between various identities.

Face Recognition Triplet

A Flatter Loss for Bias Mitigation in Cross-dataset Facial Age Estimation

no code implementations 20 Oct 2020 Ali Akbari, Muhammad Awais, Zhen-Hua Feng, Ammarah Farooq, Josef Kittler

Compared with existing loss functions, the lower gradient of the proposed loss function leads to the convergence of SGD to a better optimum point, and consequently a better generalisation.

Age Estimation Benchmarking +1

Deep Convolutional Neural Network Ensembles using ECOC

no code implementations 7 Sep 2020 Sara Atito Ali Ahmed, Cemre Zor, Berrin Yanikoglu, Muhammad Awais, Josef Kittler

Deep neural networks have enhanced the performance of decision making systems in many applications including image understanding, and further gains can be achieved by constructing ensembles.

Decision Making

Towards an Adversarially Robust Normalization Approach

1 code implementation 19 Jun 2020 Muhammad Awais, Fahad Shamshad, Sung-Ho Bae

In this paper, we investigate how BatchNorm causes this vulnerability and propose a new normalization that is robust to adversarial attacks.

Leveraging Deep Stein's Unbiased Risk Estimator for Unsupervised X-ray Denoising

1 code implementation 29 Nov 2018 Fahad Shamshad, Muhammad Awais, Muhammad Asim, Zain ul Aabidin Lodhi, Muhammad Umair, Ali Ahmed

Among the plethora of techniques devised to curb the prevalence of noise in medical images, deep learning based approaches have shown the most promise.

Deep Learning Denoising

Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks

6 code implementations CVPR 2018 Zhen-Hua Feng, Josef Kittler, Muhammad Awais, Patrik Huber, Xiao-Jun Wu

We present a new loss function, namely Wing loss, for robust facial landmark localisation with Convolutional Neural Networks (CNNs).

Ranked #1 on Face Alignment on 300W (NME_inter-pupil (%, Common) metric)

Data Augmentation Face Alignment

3D Morphable Models as Spatial Transformer Networks

1 code implementation 23 Aug 2017 Anil Bas, Patrik Huber, William A. P. Smith, Muhammad Awais, Josef Kittler

In this paper, we show how a 3D Morphable Model (i.e. a statistical model of the 3D shape of a class of objects such as faces) can be used to spatially transform input data as a module (a 3DMM-STN) within a convolutional neural network.
