Fine-Grained Visual Recognition
35 papers with code • 0 benchmarks • 5 datasets
Latest papers
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition
Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and 2 object detection datasets under the zero-shot recognition setting.
HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding
We explore constructing the class hierarchy into a graph, with its nodes representing the textual or image features of each category.
Dynamic Conceptional Contrastive Learning for Generalized Category Discovery
This makes traditional novel category discovery (NCD) methods unsuitable for GCD, because they assume that unlabeled data come only from novel categories.
Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems
Specifically, we fit the GradCAM with a branch with limited fitting capacity, which allows the branch to capture the common rationales and discard the less common discriminative patterns.
Fine-Grained Visual Classification via Internal Ensemble Learning Transformer
The proposed IELT involves three main modules: a multi-head voting (MHV) module, a cross-layer refinement (CLR) module, and a dynamic selection (DS) module.
Multi-View Active Fine-Grained Visual Recognition
Despite the remarkable progress of fine-grained visual classification (FGVC) over the years, it is still limited to recognizing 2D images.
Part-guided Relational Transformers for Fine-grained Visual Recognition
This framework, namely PArt-guided Relational Transformers (PART), is proposed to learn the discriminative part features with an automatic part discovery module, and to explore the intrinsic correlations with a feature transformation module by adapting the Transformer models from the field of natural language processing.
Penalizing the Hard Example But Not Too Much: A Strong Baseline for Fine-Grained Visual Classification
Second, we instantiate the loss function and provide a strong baseline for FGVC, in which the performance of a naive backbone is boosted to be comparable with recent methods.
Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism
In low data regimes, a network often struggles to choose the correct regions for recognition and tends to overfit to spuriously correlated patterns in the training data.
MemSAC: Memory Augmented Sample Consistency for Large Scale Unsupervised Domain Adaptation
Practical real-world datasets with plentiful categories introduce new challenges for unsupervised domain adaptation, such as small inter-class discriminability, which existing approaches relying on domain invariance alone cannot handle sufficiently well.