Instance Segmentation

977 papers with code • 25 benchmarks • 83 datasets

Instance Segmentation is a computer vision task that involves identifying and separating individual objects within an image, including detecting the boundaries of each object and assigning a unique label to each object. The goal of instance segmentation is to produce a pixel-wise segmentation map of the image, where each pixel is assigned to a specific object instance.

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Benchmarks

Add a Result

These leaderboards are used to track progress in Instance Segmentation

Dataset	Best Model	Compare
COCO test-dev	EVA	See all
COCO minival	InternImage-H	See all
LVIS v1.0 val	Co-DETR (single-scale)	See all
Cityscapes val	OneFormer (InternImage-H, emb_dim=256, single-scale)	See all
ADE20K val	OneFormer (InternImage-H, emb_dim=1024, single-scale, 896x896, COCO-Pretrained)	See all
Cityscapes test	Deep Watershed Transform	See all
Occluded COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)	See all
Separated COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)	See all
iSAID	PANet++	See all
TBBR	Swin-T (ImageNet-1k pretrain)	See all
COCO 2017 val	SparK (ConvNeXt V1-B Mask R-CNN)	See all
BDD100K val	Mask Transfiner	See all
COCO val (panoptic labels)	OneFormer (InternImage-H, emb_dim=1024, single-scale)	See all
UIIS	WaterMask RCNN	See all
NYU Depth v2	SGPN-CNN	See all
KINS	BCNet	See all
nuScenes	TraDeS	See all
coco minval	R3-CNN (ResNet-50-FPN, GC-Net)	See all
Leaf Segmentation Challenge	LeafMask	See all
iShape	ASIS(baseline)	See all
LVIS v1.0 test-dev	R50-FPN-MaskRCNN-TTA	See all
PartNet	Semantic Segmentation-Assisted Instance Feature Fusion	See all
COCO val2017	MogaNet-S (256x192)	See all
Object Detection on COCO minival	DaViT-T (Mask R-CNN, 36 epochs)	See all
LDD	R^3-CNN	See all

Show all 25 benchmarks

Collapse benchmarks

Libraries

Use these libraries to find Instance Segmentation models and implementations

open-mmlab/mmdetection

31 papers

27,976

PaddlePaddle/PaddleDetection

18 papers

12,149

rwightman/pytorch-image-models

17 papers

29,974

huggingface/transformers

6 papers

126,027

See all 20 libraries.

Datasets

Subtasks

Unsupervised Object Segmentation

Amodal Instance Segmentation

Box-supervised Instance Segmentation

Unseen Object Instance Segmentation

Image-level Supervised Instance Segmentation

3D Semantic Instance Segmentation

Open-World Instance Segmentation

Human Instance Segmentation

One-Shot Instance Segmentation

Semi-Supervised Person Instance Segmentation

Point-Supervised Instance Segmentation

Solar Cell Segmentation

Most implemented papers

Most implemented Social Latest No code

Mask R-CNN

tensorflow/models • • ICCV 2017

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

172

Paper
Code

MMDetection: Open MMLab Detection Toolbox and Benchmark

open-mmlab/mmdetection • • 17 Jun 2019

In this paper, we introduce the various features of this toolbox.

144

Paper
Code

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

microsoft/Swin-Transformer • • ICCV 2021

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Paper
Code

YOLACT: Real-time Instance Segmentation

dbolya/yolact • • ICCV 2019

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Paper
Code

Deep High-Resolution Representation Learning for Visual Recognition

open-mmlab/mmdetection • • 20 Aug 2019

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection.

Paper
Code

Deep High-Resolution Representation Learning for Human Pose Estimation

leoxiaobin/deep-high-resolution-net.pytorch • • CVPR 2019

We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the mutli-resolution subnetworks in parallel.

Paper
Code

YOLACT++: Better Real-time Instance Segmentation

dbolya/yolact • • 3 Dec 2019

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

Paper
Code

Microsoft COCO: Common Objects in Context

PaddlePaddle/PaddleDetection • • 1 May 2014

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.

Paper
Code