TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	Cityscapes val	HRNetV2-OCR+PSA	mIoU	86.93	# 4
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	AP	78.9	# 5
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	AP50	93.6	# 6
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	AP75	85.8	# 8
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	APL	83.6	# 6
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	APM	76.1	# 8
Pose Estimation	COCO test-dev	UDP-Pose-PSA(256x192)	AR	81.4	# 13
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	AP	79.5	# 3
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	AP50	93.6	# 6
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	AP75	85.9	# 6
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	APL	84.3	# 4
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	APM	76.3	# 7
Pose Estimation	COCO test-dev	UDP-Pose-PSA(384x288)	AR	81.9	# 9
Keypoint Detection	MS COCO	UDP-Pose-PSA(384x288)	Validation AP	79.5	# 2

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/polarized-self-attention-towards-high-quality-1/keypoint-detection-on-coco)](https://paperswithcode.com/sota/keypoint-detection-on-coco?p=polarized-self-attention-towards-high-quality-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/polarized-self-attention-towards-high-quality-1/pose-estimation-on-coco-test-dev)](https://paperswithcode.com/sota/pose-estimation-on-coco-test-dev?p=polarized-self-attention-towards-high-quality-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/polarized-self-attention-towards-high-quality-1/semantic-segmentation-on-cityscapes-val)](https://paperswithcode.com/sota/semantic-segmentation-on-cityscapes-val?p=polarized-self-attention-towards-high-quality-1)`
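
The three badge snippets above share one URL pattern and differ only in the task slug. As a minimal sketch, the pattern can be reproduced as follows. The function name `pwc_badge_markdown` and its parameter names are hypothetical and chosen for illustration; only the URL layout is taken from the snippets above.

```python
def pwc_badge_markdown(paper_slug: str, task_slug: str) -> str:
    """Build a badge snippet following the URL pattern of the rows above.

    The helper name and arguments are illustrative assumptions; the two URL
    templates are copied from the badge table.
    """
    badge_url = (
        "https://img.shields.io/endpoint.svg?url="
        f"https://paperswithcode.com/badge/{paper_slug}/{task_slug}"
    )
    leaderboard_url = f"https://paperswithcode.com/sota/{task_slug}?p={paper_slug}"
    return f"[![PWC]({badge_url})]({leaderboard_url})"


PAPER_SLUG = "polarized-self-attention-towards-high-quality-1"
TASK_SLUGS = [
    "keypoint-detection-on-coco",
    "pose-estimation-on-coco-test-dev",
    "semantic-segmentation-on-cityscapes-val",
]

for task in TASK_SLUGS:
    print(pwc_badge_markdown(PAPER_SLUG, task))
```

Each printed line should match the corresponding badge row, so the snippet is mainly useful for checking that a slug was copied correctly.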

Polarized Self-Attention: Towards High-quality Pixel-wise Regression

arXiv preprint 2021 · Huajun Liu, Fuqiang Liu, Xinyi Fan, Dong Huang

Pixel-wise regression is probably the most common problem in fine-grained computer vision tasks, such as estimating keypoint heatmaps and segmentation masks. These regression problems are very challenging, particularly because they require modeling long-range dependencies on high-resolution inputs/outputs, at low computational overhead, to estimate the highly nonlinear pixel-wise semantics. While attention mechanisms in Deep Convolutional Neural Networks (DCNNs) have become popular for boosting long-range dependencies, element-specific attention, such as Nonlocal blocks, is highly complex and noise-sensitive to learn, and most simplified attention hybrids try to reach the best compromise among multiple types of tasks. In this paper, we present the Polarized Self-Attention (PSA) block that incorporates two critical designs towards high-quality pixel-wise regression: (1) Polarized filtering: keeping high internal resolution in both channel and spatial attention computation while completely collapsing input tensors along their counterpart dimensions. (2) Enhancement: composing non-linearity that directly fits the output distribution of typical fine-grained regression, such as the 2D Gaussian distribution (keypoint heatmaps) or the 2D Binomial distribution (binary segmentation masks). PSA appears to have exhausted the representation capacity within its channel-only and spatial-only branches, such that there are only marginal metric differences between its sequential and parallel layouts. Experimental results show that PSA boosts standard baselines by $2-4$ points, and boosts state-of-the-art methods by $1-2$ points on 2D pose estimation and semantic segmentation benchmarks.
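
To make the two designs in the abstract concrete, here is a minimal pure-Python sketch of a parallel-layout block on a tiny feature map. It is not the authors' implementation. Several specifics are assumptions that the abstract does not state: the internal channel width of C/2, the softmax-then-sigmoid composition used for "enhancement", random matrices standing in for learned 1x1 convolutions, and the omission of any normalization layer. All function and parameter names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply an (m x k) by a (k x n) matrix stored as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def softmax(v):
    m = max(v)
    e = [math.exp(val - m) for val in v]
    s = sum(e)
    return [val / s for val in e]


def sigmoid(val):
    return 1.0 / (1.0 + math.exp(-val))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def channel_only_gate(x, wq, wv, wz):
    """x is C x N (N = H*W). Space is fully collapsed; channels keep C/2 width."""
    q = softmax(matmul(wq, x)[0])                    # N spatial weights
    v = matmul(wv, x)                                # C/2 x N
    pooled = [[sum(row[j] * q[j] for j in range(len(q)))] for row in v]  # C/2 x 1
    z = matmul(wz, pooled)                           # C x 1
    return [sigmoid(row[0]) for row in z]            # one gate per channel


def spatial_only_gate(x, wq, wv):
    """Channels are fully collapsed; the full N = H*W resolution is kept."""
    q_map = matmul(wq, x)                            # C/2 x N
    q = softmax([sum(row) / len(row) for row in q_map])  # C/2 channel weights
    v = matmul(wv, x)                                # C/2 x N
    return [sigmoid(sum(q[c] * v[c][j] for c in range(len(q))))
            for j in range(len(x[0]))]               # one gate per position


def psa_parallel(x, p):
    """Parallel layout: sum of the channel-gated and spatially gated inputs."""
    ch = channel_only_gate(x, p["ch_wq"], p["ch_wv"], p["ch_wz"])
    sp = spatial_only_gate(x, p["sp_wq"], p["sp_wv"])
    return [[(ch[c] + sp[j]) * x[c][j] for j in range(len(x[0]))]
            for c in range(len(x))]


def make_params(c, rng):
    half = c // 2  # assumed internal channel width
    return {
        "ch_wq": rand_matrix(1, c, rng),
        "ch_wv": rand_matrix(half, c, rng),
        "ch_wz": rand_matrix(c, half, rng),
        "sp_wq": rand_matrix(half, c, rng),
        "sp_wv": rand_matrix(half, c, rng),
    }


rng = random.Random(0)
C, N = 4, 6                      # 4 channels, a 2x3 map flattened to 6 positions
x = rand_matrix(C, N, rng)
params = make_params(C, rng)
out = psa_parallel(x, params)
print(len(out), len(out[0]))     # output keeps the input shape: 4 6
```

In this sketch the channel branch produces C gates from a single spatially pooled query, and the spatial branch produces H*W gates from a single channel-pooled query, which is one reading of "collapsing input tensors along their counterpart dimensions". A sequential layout would instead apply the spatial gate to the channel-gated output.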

Code

Repository	GitHub stars
DeLightCMU/PSA (official)	236
PaddlePaddle/PaddleSeg	8,252
xmu-xiaoma666/External-Attention-py…	1,552
sithu31296/semantic-segmentation	756
sithu31296/pose-estimation	

5 of 6 listed implementations shown.

Tasks

2D Pose Estimation

Keypoint Detection

Pose Estimation

regression

Segmentation

Semantic Segmentation

Vocal Bursts Intensity Prediction

Datasets

MS COCO

Cityscapes

Results from the Paper

Ranked #2 on Keypoint Detection on MS COCO (Validation AP metric)

