Fine-Grained Image Classification

173 papers with code • 35 benchmarks • 36 datasets

Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories within a larger category, for example distinguishing between species of birds or types of flowers. The task is considered fine-grained because the model must distinguish subtle differences in visual appearance and patterns, which makes it more challenging than standard image classification.

(Image credit: Looking for the Devil in the Details)

Most implemented papers

Neural Architecture Transfer

human-analysis/neural-architecture-transfer 12 May 2020

At the same time, the architecture search and transfer is orders of magnitude more efficient than existing NAS methods.

Learning Semantically Enhanced Feature for Fine-Grained Image Classification

cswluo/SEF 24 Jun 2020

We aim to provide a computationally cheap yet effective approach for fine-grained image classification (FGIC) in this letter.

Concept Learners for Few-Shot Learning

snap-stanford/comet ICLR 2021

Developing algorithms that are able to generalize to a novel task given only a few labeled examples represents a fundamental challenge in closing the gap between machine- and human-level performance.

SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data

Shaoli-Huang/SnapMix 9 Dec 2020

As the main discriminative information of a fine-grained image usually resides in subtle regions, methods along this line are prone to heavy label noise in fine-grained recognition.

Fine-Grained Visual Classification via Simultaneously Learning of Multi-regional Multi-grained Features

dongliangchang/Top-Down-Spatial-Attention-Loss 31 Jan 2021

Finally, we can obtain multiple discriminative regions on high-level feature channels and obtain multiple more minute regions within these discriminative regions on middle-level feature channels.

TransFG: A Transformer Architecture for Fine-grained Recognition

TACJu/TransFG 14 Mar 2021

Fine-grained visual classification (FGVC), which aims at recognizing objects from subcategories, is a very challenging task due to the inherently subtle inter-class differences.

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations

google-research/vision_transformer ICLR 2022

Vision Transformers (ViTs) and MLPs signal further efforts on replacing hand-wired features or inductive biases with general-purpose neural architectures.

AutoFormer: Searching Transformers for Visual Recognition

microsoft/AutoML ICCV 2021

Specifically, the performance of these subnets with weights inherited from the supernet is comparable to those retrained from scratch.

Self-Supervised Learning by Estimating Twin Class Distributions

bytedance/TWIST 14 Oct 2021

To solve this problem, we propose to maximize the mutual information between the input and the class predictions.
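As a rough illustration of what "mutual information between the input and the class predictions" can mean in practice, the sketch below uses a common batch estimate: the entropy of the batch-averaged prediction minus the average entropy of each per-sample prediction. This is a generic estimator written under our own assumptions, not the loss from the TWIST repository; the function names `entropy` and `mutual_information` and the toy inputs are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(probs):
    """Batch estimate of I(X; Y) from per-sample class distributions p(y | x_i).

    Assumed estimator (not taken from the paper):
        I(X; Y) ~= H(mean_i p(y | x_i)) - mean_i H(p(y | x_i))
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution: average the predictions over the batch.
    marginal = [sum(row[c] for row in probs) / n for c in range(k)]
    # Average uncertainty of the individual predictions.
    conditional = sum(entropy(row) for row in probs) / n
    return entropy(marginal) - conditional


# Hypothetical toy batches of two samples and two classes.
confident = [[1.0, 0.0], [0.0, 1.0]]  # sharp predictions, balanced classes
uniform = [[0.5, 0.5], [0.5, 0.5]]    # every prediction is uninformative
collapsed = [[1.0, 0.0], [1.0, 0.0]]  # sharp predictions, single class

mi_confident = mutual_information(confident)
mi_uniform = mutual_information(uniform)
mi_collapsed = mutual_information(collapsed)
```

Under this estimate, the value is highest when each prediction is confident and the classes are used evenly across the batch, and it drops to zero both when predictions are uniform and when all samples collapse onto one class.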

A Simple Episodic Linear Probe Improves Visual Recognition in the Wild

akira-l/ELP CVPR 2022

In this paper, we propose an episodic linear probing (ELP) classifier to reflect the generalization of visual representations in an online manner.