TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	ImageNet	Pyramid ViG-B	Top 1 Accuracy	83.7%	# 365
Image Classification	ImageNet	Pyramid ViG-B	Number of params	92.6M	# 854
Image Classification	ImageNet	Pyramid ViG-B	GFLOPs	16.8	# 351
Image Classification	ImageNet	Pyramid ViG-S	Top 1 Accuracy	82.1%	# 525
Image Classification	ImageNet	Pyramid ViG-S	Number of params	27.3M	# 623
Image Classification	ImageNet	Pyramid ViG-S	GFLOPs	4.6	# 215
Image Classification	ImageNet	Pyramid ViG-M	Top 1 Accuracy	83.1%	# 426
Image Classification	ImageNet	Pyramid ViG-M	Number of params	51.7M	# 734
Image Classification	ImageNet	Pyramid ViG-M	GFLOPs	8.9	# 284
Image Classification	ImageNet	Pyramid ViG-Ti	Top 1 Accuracy	78.2%	# 778
Image Classification	ImageNet	Pyramid ViG-Ti	Number of params	10.7M	# 482
Image Classification	ImageNet	Pyramid ViG-Ti	GFLOPs	1.7	# 135

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/vision-gnn-an-image-is-worth-graph-of-nodes/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=vision-gnn-an-image-is-worth-graph-of-nodes)`

Vision GNN: An Image is Worth Graph of Nodes

1 Jun 2022 · Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, Enhua Wu

Network architecture plays a key role in deep learning-based computer vision systems. The widely used convolutional neural network and transformer treat the image as a grid or sequence structure, which is not flexible enough to capture irregular and complex objects. In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level features for visual tasks. We first split the image into a number of patches, which are viewed as nodes, and construct a graph by connecting each node to its nearest neighbors. Based on this graph representation of images, we build our ViG model to transform and exchange information among all the nodes. ViG consists of two basic modules: a Grapher module with graph convolution for aggregating and updating graph information, and an FFN module with two linear layers for node feature transformation. Both isotropic and pyramid architectures of ViG are built with different model sizes. Extensive experiments on image recognition and object detection tasks demonstrate the superiority of our ViG architecture. We hope this pioneering study of GNNs on general visual tasks will provide useful inspiration and experience for future research. The PyTorch code is available at https://github.com/huawei-noah/Efficient-AI-Backbones and the MindSpore code is available at https://gitee.com/mindspore/models.
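The pipeline described in the abstract (patches as nodes, a nearest-neighbor graph, a graph aggregation step, then a two-layer FFN) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the function names are hypothetical, raw pixel values stand in for learned patch embeddings, mean aggregation stands in for the graph convolution used in the Grapher module, and the ReLU between the two linear layers is an assumption, since the abstract names no activation.

```python
import math


def split_into_patches(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patches.

    Assumes H and W are divisible by `patch`. Each patch becomes one node.
    """
    height, width = len(image), len(image[0])
    nodes = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            nodes.append([image[r][c]
                          for r in range(top, top + patch)
                          for c in range(left, left + patch)])
    return nodes


def knn_graph(nodes, k):
    """Return, for each node, the indices of its k nearest other nodes."""
    neighbors = []
    for i, x in enumerate(nodes):
        # Euclidean distance in feature space; ties are broken by index.
        dists = sorted((math.dist(x, y), j)
                       for j, y in enumerate(nodes) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def aggregate_mean(nodes, neighbors):
    """Stand-in for graph convolution: blend each node with its neighbor mean."""
    updated = []
    for x, nbrs in zip(nodes, neighbors):
        mean = [sum(nodes[j][d] for j in nbrs) / len(nbrs)
                for d in range(len(x))]
        updated.append([(a + b) / 2 for a, b in zip(x, mean)])
    return updated


def ffn(x, w1, w2):
    """Two linear layers applied to one node feature (ReLU is an assumption)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


# Toy 4x4 image split into four 2x2 patches (four graph nodes).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [1, 1, 8, 8],
    [1, 1, 8, 8],
]
nodes = split_into_patches(image, patch=2)
neighbors = knn_graph(nodes, k=1)
updated = aggregate_mean(nodes, neighbors)
print(neighbors)  # [[2], [3], [0], [1]]
```

In this toy example the dark patches connect to each other and the bright patches connect to each other, regardless of where they sit in the image. That is the flexibility the abstract contrasts with a fixed grid or sequence ordering.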

Code

huawei-noah/CV-backbones official

3,803

huawei-noah/efficient-ai-backbones official

3,802

huawei-noah/ghostnet

3,803

iamhankai/ghostnet

3,802

See all 11 implementations

Tasks

Image Classification

Object Detection

Datasets

ImageNet

MS COCO

Results from the Paper

Ranked #365 on Image Classification on ImageNet


Methods

Convolution
