Instance Segmentation

960 papers with code • 25 benchmarks • 82 datasets

Instance segmentation is a computer vision task that identifies and separates the individual objects in an image: it detects the boundary of each object and assigns each object a unique label. The goal is a pixel-wise segmentation map in which every pixel that belongs to an object is assigned to one specific object instance, so that two objects of the same class receive different labels.

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21
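The pixel-wise map described above can be illustrated with a small, self-contained sketch. It assumes one common representation, an integer ID map in which 0 marks background and each positive integer marks one object instance; the function name `instances_from_id_map` and the toy data are hypothetical and not taken from any library listed on this page. Note that a single ID map cannot express overlapping instances, which is one reason many datasets store one mask per instance instead.

```python
from collections import defaultdict


def instances_from_id_map(id_map, background=0):
    """Group pixels by instance ID and report area and bounding box.

    id_map is a list of rows; each cell holds the ID of the instance
    that owns that pixel, or `background` if no object covers it.
    """
    pixels = defaultdict(list)
    for row, line in enumerate(id_map):
        for col, instance_id in enumerate(line):
            if instance_id != background:
                pixels[instance_id].append((row, col))

    summary = {}
    for instance_id, coords in pixels.items():
        rows = [r for r, _ in coords]
        cols = [c for _, c in coords]
        summary[instance_id] = {
            "area": len(coords),
            # (top, left, bottom, right), inclusive pixel indices
            "bbox": (min(rows), min(cols), max(rows), max(cols)),
        }
    return summary


# Toy 4x6 map: 0 = background, 1 and 2 are two separate objects.
toy_map = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 0, 2, 0],
]
summary = instances_from_id_map(toy_map)
```

Here `summary` holds one entry per instance, so two objects of the same class would still appear as two separate keys.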

Benchmarks

These leaderboards track progress in Instance Segmentation.

Dataset	Best Model
COCO test-dev	EVA
COCO minival	InternImage-H
LVIS v1.0 val	Co-DETR (single-scale)
Cityscapes val	OneFormer (InternImage-H, emb_dim=256, single-scale)
ADE20K val	OneFormer (InternImage-H, emb_dim=1024, single-scale, 896x896, COCO-Pretrained)
Cityscapes test	Deep Watershed Transform
Occluded COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)
Separated COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)
iSAID	PANet++
TBBR	Swin-T (ImageNet-1k pretrain)
COCO 2017 val	SparK (ConvNeXt V1-B Mask R-CNN)
BDD100K val	Mask Transfiner
COCO val (panoptic labels)	OneFormer (InternImage-H, emb_dim=1024, single-scale)
UIIS	WaterMask RCNN
NYU Depth v2	SGPN-CNN
KINS	BCNet
nuScenes	TraDeS
coco minval	R3-CNN (ResNet-50-FPN, GC-Net)
Leaf Segmentation Challenge	LeafMask
iShape	ASIS(baseline)
LVIS v1.0 test-dev	R50-FPN-MaskRCNN-TTA
PartNet	Semantic Segmentation-Assisted Instance Feature Fusion
COCO val2017	MogaNet-S (256x192)
Object Detection on COCO minival	DaViT-T (Mask R-CNN, 36 epochs)
LDD	R^3-CNN

Libraries

Use these libraries to find Instance Segmentation models and implementations:

open-mmlab/mmdetection • 31 papers • 27,708 stars

PaddlePaddle/PaddleDetection • 18 papers • 12,029 stars

rwightman/pytorch-image-models • 17 papers • 29,671 stars

huggingface/transformers • 6 papers • 124,527 stars

See all 20 libraries.

Subtasks

Unsupervised Object Segmentation

Amodal Instance Segmentation

Box-supervised Instance Segmentation

Image-level Supervised Instance Segmentation

Unseen Object Instance Segmentation

3D Semantic Instance Segmentation

Open-World Instance Segmentation

Human Instance Segmentation

One-Shot Instance Segmentation

Semi-Supervised Person Instance Segmentation

Point-Supervised Instance Segmentation

Solar Cell Segmentation

Most implemented papers

Non-local Neural Networks

facebookresearch/video-nonlocal-net • CVPR 2018

Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.

SOLO: Segmenting Objects by Locations

open-mmlab/mmdetection • ECCV 2020

We present a new, embarrassingly simple approach to instance segmentation in images.

Deformable ConvNets v2: More Deformable, Better Results

open-mmlab/mmdetection • CVPR 2019

The superior performance of Deformable Convolutional Networks arises from its ability to adapt to the geometric variations of objects.

Towards End-to-End Lane Detection: an Instance Segmentation Approach

MaybeShewill-CV/lanenet-lane-detection • 15 Feb 2018

By doing so, we ensure a lane fitting which is robust against road plane changes, unlike existing approaches that rely on a fixed, pre-defined transformation.

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

NVIDIA/pix2pixHD • CVPR 2018

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs).

Swin Transformer V2: Scaling Up Capacity and Resolution

microsoft/Swin-Transformer • CVPR 2022

Three main techniques are proposed: 1) a residual-post-norm method combined with cosine attention to improve training stability; 2) a log-spaced continuous position bias method to effectively transfer models pre-trained using low-resolution images to downstream tasks with high-resolution inputs; 3) a self-supervised pre-training method, SimMIM, to reduce the need for vast amounts of labeled images.
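As a rough illustration of the cosine attention named in the first technique, the sketch below replaces the usual dot-product score with the cosine similarity between query and key, divided by a temperature and optionally shifted by a position bias. It is a plain-Python toy, not the code in microsoft/Swin-Transformer: the function name `cosine_attention` is hypothetical, the temperature `tau` is fixed here as a simplifying assumption (in the paper it is learned), and the bias is passed in as a ready-made matrix.

```python
import math


def cosine_attention(queries, keys, values, tau=0.1, bias=None):
    """Single-head attention using cosine similarity / tau as the score."""

    def normalize(vec):
        # Unit-normalize so the dot product below is a cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    q_hat = [normalize(q) for q in queries]
    k_hat = [normalize(k) for k in keys]

    outputs = []
    for i, q in enumerate(q_hat):
        logits = []
        for j, k in enumerate(k_hat):
            score = sum(a * b for a, b in zip(q, k)) / tau
            if bias is not None:
                score += bias[i][j]
            logits.append(score)
        # Numerically stable softmax over the keys.
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values))
             for d in range(len(values[0]))]
        )
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
small = cosine_attention([[1.0, 0.0]], keys, values)
large = cosine_attention([[100.0, 0.0]], keys, values)
```

Because the scores depend only on the angle between vectors, `small` and `large` agree even though the second query is 100 times larger in magnitude, which gives some intuition for why bounded scores can help training stability.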

SOLOv2: Dynamic and Fast Instance Segmentation

WXinlong/SOLO • NeurIPS 2020

Importantly, we take one step further by dynamically learning the mask head of the object segmenter such that the mask head is conditioned on the location.

Visual Attention Network

Visual-Attention-Network/VAN-Classification • 20 Feb 2022

In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings.
