Instance Segmentation

971 papers with code • 25 benchmarks • 83 datasets

Instance Segmentation is a computer vision task that identifies and separates individual objects within an image: it detects the boundaries of each object and assigns each object a unique label. The goal is to produce a pixel-wise segmentation map of the image, where each object pixel is assigned to a specific object instance.

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21
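
As a rough illustration of the pixel-wise map described above, the sketch below merges per-object binary masks into a single integer map. It is a toy example only, not taken from any of the listed papers or libraries: the function name `masks_to_instance_map`, the convention that 0 means background, and the rule that later masks win on overlap are all assumptions made for this sketch.

```python
import numpy as np


def masks_to_instance_map(masks):
    """Merge per-object boolean masks into one integer instance map.

    Pixel value 0 means background; value k (k >= 1) means the pixel
    belongs to the k-th mask. Where masks overlap, the later mask wins
    (an arbitrary choice made for this sketch).
    """
    if len(masks) == 0:
        raise ValueError("need at least one mask")
    instance_map = np.zeros(masks[0].shape, dtype=np.int32)
    for instance_id, mask in enumerate(masks, start=1):
        instance_map[np.asarray(mask, dtype=bool)] = instance_id
    return instance_map


# Two toy objects on a 4x5 image (hypothetical data).
mask_a = np.zeros((4, 5), dtype=bool)
mask_a[0:2, 0:2] = True
mask_b = np.zeros((4, 5), dtype=bool)
mask_b[2:4, 3:5] = True

instance_map = masks_to_instance_map([mask_a, mask_b])
print(instance_map)
```

Unlike a semantic segmentation map, two objects of the same class receive different IDs here, which is what lets downstream code count or crop individual objects.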

Benchmarks

These leaderboards are used to track progress in Instance Segmentation.

Dataset	Best Model
COCO test-dev	EVA
COCO minival	InternImage-H
LVIS v1.0 val	Co-DETR (single-scale)
Cityscapes val	OneFormer (InternImage-H, emb_dim=256, single-scale)
ADE20K val	OneFormer (InternImage-H, emb_dim=1024, single-scale, 896x896, COCO-Pretrained)
Cityscapes test	Deep Watershed Transform
Occluded COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)
Separated COCO	Swin-B + Cascade Mask R-CNN (tri-layer modelling)
iSAID	PANet++
TBBR	Swin-T (ImageNet-1k pretrain)
COCO 2017 val	SparK (ConvNeXt V1-B Mask R-CNN)
BDD100K val	Mask Transfiner
COCO val (panoptic labels)	OneFormer (InternImage-H, emb_dim=1024, single-scale)
UIIS	WaterMask RCNN
NYU Depth v2	SGPN-CNN
KINS	BCNet
nuScenes	TraDeS
coco minval	R3-CNN (ResNet-50-FPN, GC-Net)
Leaf Segmentation Challenge	LeafMask
iShape	ASIS(baseline)
LVIS v1.0 test-dev	R50-FPN-MaskRCNN-TTA
PartNet	Semantic Segmentation-Assisted Instance Feature Fusion
COCO val2017	MogaNet-S (256x192)
Object Detection on COCO minival	DaViT-T (Mask R-CNN, 36 epochs)
LDD	R^3-CNN

Libraries

Use these libraries to find Instance Segmentation models and implementations.

open-mmlab/mmdetection • 31 papers • 27,845 stars

PaddlePaddle/PaddleDetection • 18 papers • 12,085 stars

rwightman/pytorch-image-models • 17 papers • 29,826 stars

huggingface/transformers • 6 papers • 125,290 stars

See all 20 libraries.

Datasets

Subtasks

Unsupervised Object Segmentation

Amodal Instance Segmentation

Box-supervised Instance Segmentation

Image-level Supervised Instance Segmentation

Unseen Object Instance Segmentation

3D Semantic Instance Segmentation

Open-World Instance Segmentation

Human Instance Segmentation

One-Shot Instance Segmentation

Semi-Supervised Person Instance Segmentation

Point-Supervised Instance Segmentation

Solar Cell Segmentation

Most implemented papers

Is Heuristic Sampling Necessary in Training Deep Object Detectors?

facebookresearch/maskrcnn-benchmark • 11 Sep 2019

In this paper, we challenge the necessity of such hard/soft sampling methods for training accurate deep object detectors.

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

tensorflow/models • CVPR 2020

We propose SpineNet, a backbone with scale-permuted intermediate features and cross-scale connections that is learned on an object detection task by Neural Architecture Search.

Bottleneck Transformers for Visual Recognition

rwightman/pytorch-image-models • CVPR 2021

Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84.7% top-1 accuracy on the ImageNet benchmark while being up to 1.64x faster in compute time than the popular EfficientNet models on TPU-v3 hardware.

Efficient Attention: Attention with Linear Complexities

cmsflash/efficient-attention • 4 Dec 2018

Dot-product attention has wide applications in computer vision and natural language processing.

Panoptic Feature Pyramid Networks

facebookresearch/detectron2 • CVPR 2019

In this work, we perform a detailed study of this minimally extended version of Mask R-CNN with FPN, which we refer to as Panoptic FPN, and show it is a robust and accurate baseline for both tasks.

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

BangguWu/ECANet • CVPR 2020

By dissecting the channel attention module in SENet, we empirically show avoiding dimensionality reduction is important for learning channel attention, and appropriate cross-channel interaction can preserve performance while significantly decreasing model complexity.

UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation

MrGiovanni/UNetPlusPlus • 11 Dec 2019

The state-of-the-art models for medical image segmentation are variants of U-Net and fully convolutional networks (FCN).

XCiT: Cross-Covariance Image Transformers

rwightman/pytorch-image-models • NeurIPS 2021

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Path Aggregation Network for Instance Segmentation

ShuLiu1993/PANet • CVPR 2018

The way that information propagates in neural networks is of great importance.

Key Points Estimation and Point Instance Segmentation Approach for Lane Detection

koyeongmin/PINet • 16 Feb 2020

In the case of traffic line detection, an essential perception module, many conditions should be considered, such as the number of traffic lines and the computing power of the target system.
