TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Few-Shot Semantic Segmentation	COCO-20i (1-shot)	FPTrans (ViT-B/16)	Mean IoU	42	# 43
Few-Shot Semantic Segmentation	COCO-20i (1-shot)	FPTrans (DeiT-B/16)	Mean IoU	47	# 12
Few-Shot Semantic Segmentation	COCO-20i (5-shot)	FPTrans (ViT-B/16)	Mean IoU	53.8	# 15
Few-Shot Semantic Segmentation	COCO-20i (5-shot)	FPTrans (DeiT-B/16)	Mean IoU	58.9	# 3
Few-Shot Semantic Segmentation	COCO-20i -> Pascal VOC (1-shot)	FPTrans (DeiT-B/16)	Mean IoU	69.7	# 2
Few-Shot Semantic Segmentation	COCO-20i -> Pascal VOC (1-shot)	FPTrans (ViT-B/16)	Mean IoU	67.6	# 5
Few-Shot Semantic Segmentation	COCO-20i -> Pascal VOC (5-shot)	FPTrans (DeiT-B/16)	Mean IoU	79.3	# 1
Few-Shot Semantic Segmentation	COCO-20i -> Pascal VOC (5-shot)	FPTrans (ViT-B/16)	Mean IoU	76.9	# 4
Few-Shot Semantic Segmentation	PASCAL-5i (1-Shot)	FPTrans (DeiT-B/16)	Mean IoU	68.8	# 13
Few-Shot Semantic Segmentation	PASCAL-5i (1-Shot)	FPTrans (ViT-B/16)	Mean IoU	64.7	# 49
Few-Shot Semantic Segmentation	PASCAL-5i (5-Shot)	FPTrans (DeiT-B/16)	Mean IoU	78	# 3
Few-Shot Semantic Segmentation	PASCAL-5i (5-Shot)	FPTrans (ViT-B/16)	Mean IoU	73.7	# 9


Feature-Proxy Transformer for Few-Shot Segmentation

13 Oct 2022 · Jian-Wei Zhang, Yifan Sun, Yi Yang, Wei Chen

Few-shot segmentation (FSS) aims to perform semantic segmentation on novel classes given a few annotated support samples. Rethinking recent advances, we find that the current FSS framework has deviated far from the supervised segmentation framework: given the deep features, FSS methods typically use an intricate decoder to perform sophisticated pixel-wise matching, while supervised segmentation methods use a simple linear classification head. Because of the intricacy of the decoder and its matching pipeline, such an FSS framework is not easy to follow. This paper revives the straightforward framework of "feature extractor $+$ linear classification head" and proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the "proxy" is the vector representing a semantic class in the linear classification head. FPTrans has two key points for learning discriminative features and representative proxies: 1) To better utilize the limited support samples, the feature extractor makes the query interact with the support features from the bottom to the top layers using a novel prompting strategy. 2) FPTrans uses multiple local background proxies (instead of a single one) because the background is not homogeneous and may contain some novel foreground regions. These two key points are easily integrated into the vision transformer backbone with the prompting mechanism in the transformer. Given the learned features and proxies, FPTrans directly compares their cosine similarity for segmentation. Although the framework is straightforward, we show that FPTrans achieves competitive FSS accuracy on par with state-of-the-art decoder-based methods.
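
The final step of the abstract, comparing features against proxies by cosine similarity, can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`cosine`, `masked_mean`, `segment`), the toy 2-D features, the use of a masked average of support features as the foreground proxy, and the rule of taking the maximum over background proxies are all assumptions made for illustration. The abstract does not specify how proxies are built or how the multiple background scores are combined.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def masked_mean(features, mask):
    """Average the feature vectors where mask is 1.

    Assumption: a simple way to estimate a class proxy from support
    features; the paper may construct proxies differently.
    """
    picked = [f for f, m in zip(features, mask) if m]
    if not picked:
        raise ValueError("mask selects no positions")
    dim = len(picked[0])
    return [sum(f[i] for f in picked) / len(picked) for i in range(dim)]

def segment(query_features, fg_proxy, bg_proxies):
    """Label each query position 1 (foreground) or 0 (background).

    Assumption: a position is foreground when its similarity to the
    foreground proxy exceeds its best similarity to any background proxy.
    """
    labels = []
    for f in query_features:
        fg_score = cosine(f, fg_proxy)
        bg_score = max(cosine(f, p) for p in bg_proxies)
        labels.append(1 if fg_score > bg_score else 0)
    return labels

# Toy data: four support positions and three query positions, 2-D features.
support_features = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [-1.0, 0.2]]
support_mask = [1, 1, 0, 0]  # first two positions belong to the novel class

fg_proxy = masked_mean(support_features, support_mask)
bg_proxies = [[0.0, 1.0], [-1.0, 0.2]]  # two distinct background regions

query_features = [[0.8, 0.1], [0.1, 0.9], [-0.7, 0.1]]
prediction = segment(query_features, fg_proxy, bg_proxies)
print(prediction)
```

In this toy setup the two background proxies point in different directions, so a single averaged background vector would represent neither region well. That is the intuition behind using several local background proxies rather than one.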

Code

jarvis73/fptrans official

Jarvis73/FPTransPaddle official

Tasks

Few-Shot Semantic Segmentation

Segmentation

Semantic Segmentation

Datasets

MS COCO

PASCAL-5i

Results from the Paper

Ranked #1 on Few-Shot Semantic Segmentation on COCO-20i -> Pascal VOC (5-shot)

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer • Vision Transformer
