Search Results for author: Zi-Chao Zhang

Found 1 paper, 0 papers with code

ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator

no code implementations • 24 Mar 2022 • Zi-Chao Zhang, Zhen-Duo Chen, Yongxin Wang, Xin Luo, Xin-Shun Xu

Recently, several Vision Transformer (ViT) based methods have been proposed for Fine-Grained Visual Classification (FGVC). These methods significantly surpass existing CNN-based ones, demonstrating the effectiveness of ViT in FGVC tasks. However, applying ViT directly to FGVC has some limitations. First, ViT splits each image into patches and computes attention between every pair of patches, which may lead to heavily redundant computation and unsatisfactory performance on fine-grained images with complex backgrounds and small objects. Second, a standard ViT uses only the class token of the final layer for classification, which is not enough to capture comprehensive fine-grained information.
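To make the first limitation concrete, the following is a minimal sketch of how the pairwise attention cost grows with the number of patches. It is not taken from the paper: the helper names `patch_count` and `attention_pairs`, the 16-pixel patch size, and the 224 and 448 pixel image sizes are assumptions chosen only for illustration.

```python
def patch_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def attention_pairs(num_patches: int, with_class_token: bool = True) -> int:
    """Entries in one self-attention matrix: every token attends to every token."""
    tokens = num_patches + (1 if with_class_token else 0)
    return tokens * tokens


if __name__ == "__main__":
    # Assumed settings for illustration: 16x16 patches, two input resolutions.
    for size in (224, 448):
        n = patch_count(size, 16)
        print(f"{size}px image: {n} patches, {attention_pairs(n)} attention entries")
```

Under these assumed settings, doubling the image side length quadruples the number of patches and multiplies the attention entries per head and layer by roughly sixteen, even though many of those patches may cover background rather than the small object of interest.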

Fine-Grained Image Classification
