TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	ActiveMLP-L(UperNet)	Validation mIoU	51.1	# 93
Semantic Segmentation	ADE20K	ActiveMLP-L(UperNet)	Params (M)	108	# 29
Object Detection	COCO minival	ActiveMLP-B (Cascade Mask R-CNN)	box AP	52.3	# 64
Image Classification	ImageNet	ActiveMLP-L	Top 1 Accuracy	84.8%	# 270
Image Classification	ImageNet	ActiveMLP-L	Number of params	76.4M	# 801
Image Classification	ImageNet	ActiveMLP-L	GFLOPs	36.4	# 404
Image Classification	ImageNet	ActiveMLP-T	Top 1 Accuracy	82%	# 530
Image Classification	ImageNet	ActiveMLP-T	Number of params	27.2M	# 622
Image Classification	ImageNet	ActiveMLP-T	GFLOPs	4	# 191

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/activemlp-an-mlp-like-architecture-with/object-detection-on-coco-minival)](https://paperswithcode.com/sota/object-detection-on-coco-minival?p=activemlp-an-mlp-like-architecture-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/activemlp-an-mlp-like-architecture-with/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=activemlp-an-mlp-like-architecture-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/activemlp-an-mlp-like-architecture-with/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=activemlp-an-mlp-like-architecture-with)`

Active Token Mixer

11 Mar 2022 · Guoqiang Wei, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen ·

The three existing dominant network families, i.e., CNNs, Transformers, and MLPs, differ from each other mainly in the ways of fusing spatial contextual information, leaving designing more effective token-mixing mechanisms at the core of backbone architecture development. In this work, we propose an innovative token-mixer, dubbed Active Token Mixer (ATM), to actively incorporate flexible contextual information distributed across different channels from other tokens into the given query token. This fundamental operator actively predicts where to capture useful contexts and learns how to fuse the captured contexts with the query token at channel level. In this way, the spatial range of token-mixing can be expanded to a global scope with limited computational complexity, where the way of token-mixing is reformed. We take ATM as the primary operator and assemble ATMs into a cascade architecture, dubbed ATMNet. Extensive experiments demonstrate that ATMNet is generally applicable and comprehensively surpasses different families of SOTA vision backbones by a clear margin on a broad range of vision tasks, including visual recognition and dense prediction tasks. Code is available at https://github.com/microsoft/ActiveMLP.

PDF Abstract

Code

Add Remove Mark official

microsoft/activemlp official

122

microsoft/TokenMixers

122

Tasks

Add Remove

Image Classification

Instance Segmentation

Object Detection

Semantic Segmentation

Datasets

ImageNet

MS COCO

ADE20K

Results from the Paper

Edit

Ranked #64 on Object Detection on COCO minival

Get a GitHub badge

Task	Dataset	Model	Metric Name	Metric Value	Global Rank	Benchmark
Semantic Segmentation	ADE20K	ActiveMLP-L(UperNet)	Validation mIoU	51.1	# 93	Compare
Semantic Segmentation	ADE20K	ActiveMLP-L(UperNet)	Params (M)	108	# 29	Compare
Object Detection	COCO minival	ActiveMLP-B (Cascade Mask R-CNN)	box AP	52.3	# 64	Compare
Image Classification	ImageNet	ActiveMLP-L	Top 1 Accuracy	84.8%	# 270	Compare
			Number of params	76.4M	# 801	Compare
			GFLOPs	36.4	# 404	Compare
Image Classification	ImageNet	ActiveMLP-T	Top 1 Accuracy	82%	# 530	Compare
			Number of params	27.2M	# 622	Compare
			GFLOPs	4	# 191	Compare

Methods

Add Remove

No methods listed for this paper. Add relevant methods here

Edit Social Preview

Active Token Mixer

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove