Fine-Grained Image Classification

174 papers with code • 35 benchmarks • 36 datasets

Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories within a larger category, for example distinguishing different species of birds or different types of flowers. The task is considered fine-grained because the model must distinguish subtle differences in visual appearance and patterns, which makes it more challenging than regular image classification.

(Image credit: Looking for the Devil in the Details)

Latest papers with no code

Context-Semantic Quality Awareness Network for Fine-Grained Visual Categorization

no code yet • 15 Mar 2024

To tackle this challenge, we propose a weakly supervised Context-Semantic Quality Awareness Network (CSQA-Net) for FGVC.

Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains

no code yet • 28 Feb 2024

Inspired by neural collapse (NC) properties, we explore in this paper the transferability of DNN models trained with their last-layer weights fixed according to an equiangular tight frame (ETF).
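As a rough illustration of what a last layer "fixed according to ETF" can look like, the sketch below builds a simplex ETF weight matrix: every class vector has unit norm and every pair of class vectors has the same inner product, -1/(K-1). This is not the paper's code. The function names `simplex_etf` and `dot` are hypothetical, and the sketch assumes the feature dimension is at least the number of classes and uses the identity as the orthonormal basis (a random orthonormal basis would serve equally well).

```python
import math
from typing import List


def simplex_etf(num_classes: int, feat_dim: int) -> List[List[float]]:
    """Return a num_classes x feat_dim matrix whose rows form a simplex ETF.

    Assumption: feat_dim >= num_classes, with the identity used as the
    orthonormal basis. Row i is sqrt(K/(K-1)) * (e_i - 1/K * ones),
    zero-padded to feat_dim.
    """
    if num_classes < 2 or feat_dim < num_classes:
        raise ValueError("need num_classes >= 2 and feat_dim >= num_classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    rows = []
    for i in range(k):
        row = [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        row += [0.0] * (feat_dim - k)  # pad unused feature dimensions
        rows.append(row)
    return rows


def dot(a: List[float], b: List[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Fixed (non-trainable) classifier weights for 4 classes and 8 features.
W = simplex_etf(num_classes=4, feat_dim=8)
self_sim = dot(W[0], W[0])   # squared norm of one class vector
cross_sim = dot(W[0], W[1])  # inner product between two class vectors
```

In a training setup, such a matrix would be loaded into the final linear layer and excluded from gradient updates, so only the feature extractor is learned.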

Masked Image Modeling via Dynamic Token Morphing

no code yet • 30 Dec 2023

Masked Image Modeling (MIM) arises as a promising option for Vision Transformers among various self-supervised learning (SSL) methods.

Human in-the-Loop Estimation of Cluster Count in Datasets via Similarity-Driven Nested Importance Sampling

no code yet • 8 Dec 2023

Human feedback on the pairwise similarity can be used to improve the clustering, but existing approaches do not guarantee accurate count estimates.

OmniVec: Learning robust representations with cross modal sharing

no code yet • 7 Nov 2023

We demonstrate empirically that using a joint network to train across modalities leads to meaningful information sharing, which allows us to achieve state-of-the-art results on most of the benchmarks.

Dining on Details: LLM-Guided Expert Networks for Fine-Grained Food Recognition

no code yet • MADiMa Workshop in ACM Multimedia 2023

Trained through an end-to-end multi-task learning process, this method enhances performance in the fine-grained food recognition task, showing exceptional prowess with highly similar classes.

Learning with Unmasked Tokens Drives Stronger Vision Learners

no code yet • 20 Oct 2023

MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly masking input tokens for the encoder to process, with the decoder reconstructing the masked tokens to the input.
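The random masking step described above can be sketched as follows: token indices are shuffled, a fraction is kept as visible input for the encoder, and the rest become reconstruction targets for the decoder. This is a minimal sketch, not the implementation of MAE or of the listed paper. The function name `random_mask`, the placeholder patch names, and the 0.75 mask ratio are illustrative assumptions.

```python
import random
from typing import List, Optional, Tuple


def random_mask(
    num_tokens: int, mask_ratio: float, seed: Optional[int] = None
) -> Tuple[List[int], List[int]]:
    """Split token indices into (visible, masked) index lists.

    The visible tokens are what the encoder would process; the masked
    tokens are what the decoder would be asked to reconstruct.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_tokens))
    rng.shuffle(order)  # uniform random permutation of token positions
    num_keep = int(num_tokens * (1.0 - mask_ratio))
    visible = sorted(order[:num_keep])
    masked = sorted(order[num_keep:])
    return visible, masked


# Hypothetical 4x4 grid of image patches, flattened to 16 tokens.
tokens = [f"patch_{i}" for i in range(16)]
visible_idx, masked_idx = random_mask(len(tokens), mask_ratio=0.75, seed=0)
encoder_input = [tokens[i] for i in visible_idx]  # only visible tokens
```

With 16 tokens and a 0.75 ratio, 4 tokens are visible and 12 are masked; which ones depends on the seed.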

Delving into Multimodal Prompting for Fine-grained Visual Classification

no code yet • 16 Sep 2023

In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model.

PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans

no code yet • 25 Aug 2023

Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision.

Deep Neural Networks Fused with Textures for Image Classification

no code yet • 3 Aug 2023

Fine-grained image classification (FGIC) is a challenging task in computer vision due to small visual differences between subcategories but large intra-class variations.