Fine-Grained Image Classification
174 papers with code • 35 benchmarks • 36 datasets
Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories within a larger category, for example different species of birds or different types of flowers. The task is considered fine-grained because the model must distinguish between subtle differences in visual appearance and patterns, which makes it more challenging than regular image classification.
(Image credit: Looking for the Devil in the Details)
Libraries
Use these libraries to find Fine-Grained Image Classification models and implementations
Datasets
Latest papers with no code
Context-Semantic Quality Awareness Network for Fine-Grained Visual Categorization
To tackle this challenge, we propose a weakly supervised Context-Semantic Quality Awareness Network (CSQA-Net) for FGVC.
Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains
Inspired by neural collapse (NC) properties, we explore in this paper the transferability of DNN models trained with their last-layer weights fixed according to an equiangular tight frame (ETF).
Masked Image Modeling via Dynamic Token Morphing
Masked Image Modeling (MIM) arises as a promising option for Vision Transformers among various self-supervised learning (SSL) methods.
Human in-the-Loop Estimation of Cluster Count in Datasets via Similarity-Driven Nested Importance Sampling
Human feedback on the pairwise similarity can be used to improve the clustering, but existing approaches do not guarantee accurate count estimates.
OmniVec: Learning robust representations with cross modal sharing
We demonstrate empirically that using a joint network to train across modalities leads to meaningful information sharing, and this allows us to achieve state-of-the-art results on most of the benchmarks.
Dining on Details: LLM-Guided Expert Networks for Fine-Grained Food Recognition
Trained through an end-to-end multi-task learning process, this method enhances performance in the fine-grained food recognition task, showing exceptional prowess with highly similar classes.
Learning with Unmasked Tokens Drives Stronger Vision Learners
MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly masking input tokens for the encoder to process, with the decoder reconstructing the masked tokens to the input.
Delving into Multimodal Prompting for Fine-grained Visual Classification
In this paper, we aim to fully exploit the capabilities of cross-modal description to tackle FGVC tasks and propose a novel multimodal prompting solution, denoted as MP-FGVC, based on the contrastive language-image pre-training (CLIP) model.
PCNN: Probable-Class Nearest-Neighbor Explanations Improve Fine-Grained Image Classification Accuracy for AIs and Humans
Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision.
Deep Neural Networks Fused with Textures for Image Classification
Fine-grained image classification (FGIC) is a challenging task in computer vision due to small visual differences between subcategories but large intra-class variations.