TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Object Detection	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	box mAP	48.1	# 94
Object Detection	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APS	28.7	# 58
Object Detection	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APM	50.4	# 47
Object Detection	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APL	61.5	# 38
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APS	27.8	# 62
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APM	48.7	# 63
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APL	59.9	# 52
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x)	box mAP	46.8	# 107
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x)	Hardware Burden	None	# 1
Object Detection	COCO test-dev	ISTR (ResNet50-FPN-3x)	Operations per network pass	None	# 1
Instance Segmentation	COCO test-dev	ISTR-SMT (Swin-L, single scale)	mask AP	49.7	# 21
Instance Segmentation	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	mask AP	39.9	# 71
Instance Segmentation	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APS	22.8	# 13
Instance Segmentation	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APM	41.9	# 22
Instance Segmentation	COCO test-dev	ISTR (ResNet101-FPN-3x, single-scale)	APL	52.3	# 24
Instance Segmentation	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	mask AP	38.6	# 83
Instance Segmentation	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APS	22.1	# 19
Instance Segmentation	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APM	40.4	# 27
Instance Segmentation	COCO test-dev	ISTR (ResNet50-FPN-3x, single-scale)	APL	50.6	# 30

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/istr-end-to-end-instance-segmentation-with/instance-segmentation-on-coco)](https://paperswithcode.com/sota/instance-segmentation-on-coco?p=istr-end-to-end-instance-segmentation-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/istr-end-to-end-instance-segmentation-with/object-detection-on-coco)](https://paperswithcode.com/sota/object-detection-on-coco?p=istr-end-to-end-instance-segmentation-with)`

ISTR: End-to-End Instance Segmentation with Transformers

3 May 2021 · Jie Hu, Liujuan Cao, Yao Lu, Shengchuan Zhang, Yan Wang, Ke Li, Feiyue Huang, Ling Shao, Rongrong Ji

End-to-end paradigms significantly improve the accuracy of various deep-learning-based computer vision models. To this end, tasks like object detection have been upgraded by replacing non-end-to-end components, such as removing non-maximum suppression by training with a set loss based on bipartite matching. However, such an upgrade is not applicable to instance segmentation, due to its significantly higher output dimensions compared to object detection. In this paper, we propose an instance segmentation Transformer, termed ISTR, which is the first end-to-end framework of its kind. ISTR predicts low-dimensional mask embeddings, and matches them with ground truth mask embeddings for the set loss. Besides, ISTR concurrently conducts detection and segmentation with a recurrent refinement strategy, which provides a new way to achieve instance segmentation compared to the existing top-down and bottom-up frameworks. Benefiting from the proposed end-to-end mechanism, ISTR demonstrates state-of-the-art performance even with approximation-based suboptimal embeddings. Specifically, ISTR obtains a 46.8/38.6 box/mask AP using ResNet50-FPN, and a 48.1/39.9 box/mask AP using ResNet101-FPN, on the MS COCO dataset. Quantitative and qualitative results reveal the promising potential of ISTR as a solid baseline for instance-level recognition. Code has been made available at: https://github.com/hujiecpp/ISTR.
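The abstract describes matching predicted low-dimensional mask embeddings against ground-truth mask embeddings for a set loss based on bipartite matching. The sketch below illustrates only that matching step on toy data and is not the authors' implementation. The function names, the cosine-distance cost, and the embedding values are assumptions made for illustration. It also uses a brute-force search over assignments in place of the Hungarian algorithm to stay dependency-free, which is practical only for very small sets.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def match_embeddings(predicted, ground_truth):
    """Find the one-to-one assignment of ground-truth embeddings to
    predictions that minimizes the total cosine distance.

    Returns (pairs, total_cost), where pairs is a list of
    (prediction_index, ground_truth_index) tuples. Assumes
    len(predicted) >= len(ground_truth); a set loss would treat the
    unmatched predictions as "no object".
    """
    cost = [[cosine_distance(p, g) for g in ground_truth] for p in predicted]
    best_pairs, best_cost = [], math.inf
    # Brute force: try every ordered choice of predictions for the targets.
    for pred_indices in permutations(range(len(predicted)), len(ground_truth)):
        total = sum(cost[p][g] for g, p in enumerate(pred_indices))
        if total < best_cost:
            best_cost = total
            best_pairs = sorted((p, g) for g, p in enumerate(pred_indices))
    return best_pairs, best_cost


# Toy 3-dimensional "mask embeddings" (made-up values, not from the paper).
predicted = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
ground_truth = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

pairs, total_cost = match_embeddings(predicted, ground_truth)
print(pairs)  # [(0, 1), (2, 0)]
```

Here prediction 0 is matched to ground truth 1 and prediction 2 to ground truth 0, while prediction 1 stays unmatched. Because each target is assigned to exactly one prediction, duplicate predictions are penalized during training, which is how a set loss of this kind removes the need for non-maximum suppression.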

Code

hujiecpp/ISTR official

200

Tasks

Instance Segmentation

Object Detection

Segmentation

Semantic Segmentation

Datasets

MS COCO

Results from the Paper

Ranked #21 on Instance Segmentation on COCO test-dev

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
