TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	MogaNet-L (UperNet)	Validation mIoU	50.9	# 100
Semantic Segmentation	ADE20K	MogaNet-L (UperNet)	GFLOPs (512 x 512)	1176	# 21
Semantic Segmentation	ADE20K	MogaNet-S (Semantic FPN)	Validation mIoU	47.7	# 150
Semantic Segmentation	ADE20K	MogaNet-S (Semantic FPN)	GFLOPs (512 x 512)	189	# 8
Semantic Segmentation	ADE20K	MogaNet-S (UperNet)	Validation mIoU	49.2	# 128
Semantic Segmentation	ADE20K	MogaNet-S (UperNet)	GFLOPs (512 x 512)	946	# 13
Semantic Segmentation	ADE20K	MogaNet-B (UperNet)	Validation mIoU	50.1	# 113
Semantic Segmentation	ADE20K	MogaNet-B (UperNet)	GFLOPs (512 x 512)	1050	# 17
Semantic Segmentation	ADE20K	MogaNet-XL (UperNet)	Validation mIoU	54	# 63
Object Detection	COCO 2017 val	MogaNet-XL (Cascade Mask R-CNN)	AP	56.2	# 4
Object Detection	COCO 2017 val	MogaNet-T (Mask R-CNN 1x)	AP	42.6	# 21
Object Detection	COCO 2017 val	MogaNet-XT (Mask R-CNN 1x)	AP	40.7	# 23
Object Detection	COCO 2017 val	MogaNet-T (RetinaNet 1x)	AP	41.4	# 22
Object Detection	COCO 2017 val	MogaNet-XT (RetinaNet 1x)	AP	39.7	# 25
Object Detection	COCO 2017 val	MogaNet-L (Cascade Mask R-CNN)	AP	53.3	# 5
Object Detection	COCO 2017 val	MogaNet-B (Cascade Mask R-CNN)	AP	52.6	# 6
Object Detection	COCO 2017 val	MogaNet-S (Cascade Mask R-CNN)	AP	51.6	# 7
Object Detection	COCO 2017 val	MogaNet-L (Mask R-CNN 1x)	AP	49.4	# 11
Object Detection	COCO 2017 val	MogaNet-B (Mask R-CNN 1x)	AP	47.9	# 15
Object Detection	COCO 2017 val	MogaNet-S (Mask R-CNN 1x)	AP	46.7	# 18
Object Detection	COCO 2017 val	MogaNet-L (RetinaNet 1x)	AP	48.7	# 14
Object Detection	COCO 2017 val	MogaNet-B (RetinaNet 1x)	AP	47.7	# 16
Object Detection	COCO 2017 val	MogaNet-S (RetinaNet 1x)	AP	45.8	# 19
Instance Segmentation	COCO test-dev	MogaNet-B (Mask R-CNN 1x)	mask AP	43.2	# 48
Instance Segmentation	COCO test-dev	MogaNet-XL (Cascade Mask R-CNN)	mask AP	48.8	# 24
Instance Segmentation	COCO test-dev	MogaNet-L (Cascade Mask R-CNN)	mask AP	46.1	# 36
Instance Segmentation	COCO test-dev	MogaNet-B (Cascade Mask R-CNN)	mask AP	46	# 38
Instance Segmentation	COCO test-dev	MogaNet-S (Cascade Mask R-CNN)	mask AP	45.1	# 41
Instance Segmentation	COCO test-dev	MogaNet-L (Mask R-CNN 1x)	mask AP	44.1	# 45
Instance Segmentation	COCO test-dev	MogaNet-S (Mask R-CNN 1x)	mask AP	42.2	# 52
Instance Segmentation	COCO test-dev	MogaNet-T (Mask R-CNN 1x)	mask AP	39.1	# 80
Instance Segmentation	COCO test-dev	MogaNet-XT	mask AP	37.6	# 89
Instance Segmentation	COCO test-dev	MogaNet-T	mask AP	35.8	# 95
Pose Estimation	COCO val2017	MogaNet-T (256x192)	AP	73.2	# 4
Pose Estimation	COCO val2017	MogaNet-T (256x192)	AP50	90.1	# 3
Pose Estimation	COCO val2017	MogaNet-T (256x192)	AP75	81	# 3
Pose Estimation	COCO val2017	MogaNet-T (256x192)	AR	78.8	# 4
Instance Segmentation	COCO val2017	MogaNet-S (256x192)	AP50	90.7	# 1
Instance Segmentation	COCO val2017	MogaNet-S (256x192)	AP75	82.8	# 1
Pose Estimation	COCO val2017	MogaNet-S (384x288)	AP	76.4	# 2
Pose Estimation	COCO val2017	MogaNet-S (384x288)	AP50	91	# 2
Pose Estimation	COCO val2017	MogaNet-S (384x288)	AP75	83.3	# 2
Pose Estimation	COCO val2017	MogaNet-S (384x288)	AR	81.4	# 2
Pose Estimation	COCO val2017	MogaNet-B (384x288)	AP	77.3	# 1
Pose Estimation	COCO val2017	MogaNet-B (384x288)	AP50	91.4	# 1
Pose Estimation	COCO val2017	MogaNet-B (384x288)	AP75	84	# 1
Pose Estimation	COCO val2017	MogaNet-B (384x288)	AR	82.2	# 1
Pose Estimation	COCO val2017	MogaNet-S (256x192)	AP	74.9	# 3
Pose Estimation	COCO val2017	MogaNet-S (256x192)	AR	80.1	# 3
Image Classification	ImageNet	MogaNet-B	Top 1 Accuracy	84.3%	# 305
Image Classification	ImageNet	MogaNet-B	Number of params	44M	# 698
Image Classification	ImageNet	MogaNet-B	GFLOPs	9.9	# 296
Image Classification	ImageNet	MogaNet-T (256res)	Top 1 Accuracy	80%	# 664
Image Classification	ImageNet	MogaNet-T (256res)	Number of params	5.2M	# 410
Image Classification	ImageNet	MogaNet-T (256res)	GFLOPs	1.44	# 131
Image Classification	ImageNet	MogaNet-S	Top 1 Accuracy	83.4%	# 394
Image Classification	ImageNet	MogaNet-S	Number of params	25M	# 587
Image Classification	ImageNet	MogaNet-S	GFLOPs	5	# 231
Image Classification	ImageNet	MogaNet-XL (384res)	Top 1 Accuracy	87.8%	# 75
Image Classification	ImageNet	MogaNet-XL (384res)	Number of params	181M	# 885
Image Classification	ImageNet	MogaNet-XL (384res)	GFLOPs	102	# 450
Image Classification	ImageNet	MogaNet-XT (256res)	Top 1 Accuracy	77.2%	# 813
Image Classification	ImageNet	MogaNet-XT (256res)	Number of params	3M	# 366
Image Classification	ImageNet	MogaNet-XT (256res)	GFLOPs	1.04	# 106
Image Classification	ImageNet	MogaNet-L	Top 1 Accuracy	84.7%	# 281
Image Classification	ImageNet	MogaNet-L	Number of params	83M	# 810
Image Classification	ImageNet	MogaNet-L	GFLOPs	15.9	# 345
Video Prediction	Moving MNIST	Uniformer (SimVP 10x)	MSE	18.01	# 8
Video Prediction	Moving MNIST	Uniformer (SimVP 10x)	MAE	57.52	# 7
Video Prediction	Moving MNIST	HorNet (SimVP 10x)	MSE	17.4	# 5
Video Prediction	Moving MNIST	HorNet (SimVP 10x)	MAE	55.7	# 5
Video Prediction	Moving MNIST	HorNet (SimVP 10x)	SSIM	0.9624	# 6
Video Prediction	Moving MNIST	VAN (SimVP 10x)	MSE	16.21	# 4
Video Prediction	Moving MNIST	VAN (SimVP 10x)	MAE	53.57	# 4
Video Prediction	Moving MNIST	VAN (SimVP 10x)	SSIM	0.9646	# 5
Video Prediction	Moving MNIST	Poolformer (SimVP 10x)	MSE	20.96	# 13
Video Prediction	Moving MNIST	Poolformer (SimVP 10x)	MAE	64.31	# 12
Video Prediction	Moving MNIST	ConvMixer (SimVP 10x)	MSE	22.3	# 14
Video Prediction	Moving MNIST	ConvMixer (SimVP 10x)	MAE	67.37	# 13
Video Prediction	Moving MNIST	MLP-Mixer (SimVP 10x)	MSE	18.85	# 9
Video Prediction	Moving MNIST	MLP-Mixer (SimVP 10x)	MAE	59.86	# 9
Video Prediction	Moving MNIST	Swin (SimVP 10x)	MSE	19.11	# 10
Video Prediction	Moving MNIST	Swin (SimVP 10x)	MAE	59.84	# 8
Video Prediction	Moving MNIST	ViT (SimVP 10x)	MSE	19.74	# 11
Video Prediction	Moving MNIST	ViT (SimVP 10x)	MAE	61.65	# 11
Video Prediction	Moving MNIST	ViT (SimVP 10x)	SSIM	0.9539	# 10
Video Prediction	Moving MNIST	ConvNeXt (SimVP 10x)	MSE	17.58	# 6
Video Prediction	Moving MNIST	ConvNeXt (SimVP 10x)	MAE	55.76	# 6
Video Prediction	Moving MNIST	ConvNeXt (SimVP 10x)	SSIM	0.9617	# 8
Video Prediction	Moving MNIST	MogaNet (SimVP 10x)	MSE	15.67	# 3
Video Prediction	Moving MNIST	MogaNet (SimVP 10x)	MAE	51.84	# 3
Video Prediction	Moving MNIST	MogaNet (SimVP 10x)	SSIM	0.9661	# 3
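
The rows above are tab-separated, so they can be loaded with a standard CSV reader. The following is a minimal sketch, assuming the six-column layout shown in the header row. The helper names (`parse_value`, `parse_rank`, `load_rows`, `best_ranked`) are hypothetical and the sample contains only a few rows copied from the table.

```python
import csv
import io

# A few rows copied from the table above; fields are tab-separated.
SAMPLE = (
    "TASK\tDATASET\tMODEL\tMETRIC NAME\tMETRIC VALUE\tGLOBAL RANK\n"
    "Pose Estimation\tCOCO val2017\tMogaNet-T (256x192)\tAP\t73.2\t# 4\n"
    "Pose Estimation\tCOCO val2017\tMogaNet-S (384x288)\tAP\t76.4\t# 2\n"
    "Pose Estimation\tCOCO val2017\tMogaNet-B (384x288)\tAP\t77.3\t# 1\n"
    "Image Classification\tImageNet\tMogaNet-B\tTop 1 Accuracy\t84.3%\t# 305\n"
)


def parse_value(text):
    """Convert cells such as '84.3%', '44M' or '54' to a float (unit suffix dropped)."""
    return float(text.rstrip("%M"))


def parse_rank(text):
    """Convert a rank cell such as '# 21' to the integer 21."""
    return int(text.lstrip("# "))


def load_rows(tsv_text):
    """Read the tab-separated table into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for rec in reader:
        rows.append({
            "task": rec["TASK"],
            "dataset": rec["DATASET"],
            "model": rec["MODEL"],
            "metric": rec["METRIC NAME"],
            "value": parse_value(rec["METRIC VALUE"]),
            "rank": parse_rank(rec["GLOBAL RANK"]),
        })
    return rows


def best_ranked(rows, task, metric):
    """Return the row with the lowest (best) global rank for one task and metric."""
    candidates = [r for r in rows if r["task"] == task and r["metric"] == metric]
    return min(candidates, key=lambda r: r["rank"])


rows = load_rows(SAMPLE)
best = best_ranked(rows, "Pose Estimation", "AP")
print(best["model"], best["value"])
```

Ranks are compared rather than metric values because some metrics in the table (MSE, MAE, GFLOPs) are better when lower, while others (AP, mIoU, accuracy) are better when higher.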

MogaNet: Multi-order Gated Aggregation Network

7 Nov 2022 · Siyuan Li, Zedong Wang, Zicheng Liu, Cheng Tan, Haitao Lin, Di Wu, Zhiyuan Chen, Jiangbin Zheng, Stan Z. Li

By contextualizing the kernel as globally as possible, modern ConvNets have shown great potential in computer vision tasks. However, recent progress on multi-order game-theoretic interaction within deep neural networks (DNNs) reveals the representation bottleneck of modern ConvNets, where the expressive interactions have not been effectively encoded with the increased kernel size. To tackle this challenge, we propose a new family of modern ConvNets, dubbed MogaNet, for discriminative visual representation learning in pure ConvNet-based models with favorable complexity-performance trade-offs. MogaNet encapsulates conceptually simple yet effective convolutions and gated aggregation into a compact module, where discriminative features are efficiently gathered and contextualized adaptively. MogaNet exhibits great scalability, impressive parameter efficiency, and competitive performance compared to state-of-the-art ViTs and ConvNets on ImageNet and various downstream vision benchmarks, including COCO object detection, ADE20K semantic segmentation, 2D and 3D human pose estimation, and video prediction. Notably, MogaNet reaches 80.0% and 87.8% accuracy with 5.2M and 181M parameters on ImageNet-1K, outperforming ParC-Net and ConvNeXt-L while saving 59% FLOPs and 17M parameters, respectively. The source code is available at https://github.com/Westlake-AI/MogaNet.

Code

chengtan9907/OpenSTL (official, 573 stars)

Westlake-AI/openmixup (official, 570 stars)

Westlake-AI/MogaNet (official, 129 stars)

chengtan9907/simvpv2 (573 stars)

leondgarse/keras_cv_attention_models (556 stars)

5 of 6 listed implementations shown.

Tasks

3D Human Pose Estimation

Image Classification

Instance Segmentation

Object Detection

Pose Estimation

Representation Learning

Semantic Segmentation

Video Prediction

Datasets

ImageNet

MS COCO

MNIST

ADE20K

COCO-Stuff

Moving MNIST

FreiHAND

ExPose

Results from the Paper

Ranked #1 on Pose Estimation on COCO val2017

Methods

1x1 Convolution • Convolution • Dense Connections • Gated Convolution • GLU • Residual Connection • Scaled Dot-Product Attention • Vision Transformer
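
Two of the methods listed above, GLU and gated convolution, share one idea: a value branch is multiplied elementwise by a gate in the range (0, 1). The following is a minimal 1-D sketch of that idea in plain Python, not the MogaNet module itself. The uniform averaging windows, the window sizes `(1, 3, 5)`, and the function names are illustrative assumptions; the actual implementation is in the Westlake-AI/MogaNet repository.

```python
import math


def sigmoid(v):
    """Logistic function, used here as the gate nonlinearity (as in GLU)."""
    return 1.0 / (1.0 + math.exp(-v))


def local_average(signal, width):
    """Value branch: mean over a centered window (a uniform 1-D convolution, zero padded)."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for j in range(i - half, i + half + 1):
            if 0 <= j < len(signal):
                total += signal[j]
        out.append(total / width)
    return out


def gated_aggregation(signal, widths=(1, 3, 5)):
    """Sum context from several window sizes, then scale it by a sigmoid gate of the input.

    Assumption: window sizes and uniform kernels are chosen only for illustration.
    """
    context = [0.0] * len(signal)
    for w in widths:
        branch = local_average(signal, w)
        context = [c + b for c, b in zip(context, branch)]
    gate = [sigmoid(x) for x in signal]
    return [g * c for g, c in zip(gate, context)]


if __name__ == "__main__":
    print(gated_aggregation([0.0, 1.0, 2.0, 1.0, 0.0]))
```

The gate lets each position decide how much of the aggregated context to pass through, which is the role gating plays in GLU-style layers in general.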
