Instance Segmentation

943 papers with code • 24 benchmarks • 81 datasets

Instance Segmentation is a computer vision task that involves identifying and separating individual objects within an image, including detecting the boundaries of each object and assigning a unique label to each object. The goal of instance segmentation is to produce a pixel-wise segmentation map of the image, where each pixel is assigned to a specific object instance.
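As a minimal illustration of the output described above, the sketch below stores a segmentation result as an instance-ID map and splits it into one binary mask per object. The helper name `split_instances` and the convention that ID 0 means background are assumptions made for this example, not part of any particular benchmark format.

```python
def split_instances(instance_map, background=0):
    """Turn an instance-ID map into one binary mask per instance.

    instance_map: list of rows, each pixel holding the ID of the object
    it belongs to (the `background` value marks "no object"; assumed here).
    Returns a dict mapping instance ID -> binary mask of the same shape.
    """
    ids = sorted({v for row in instance_map for v in row if v != background})
    return {
        i: [[1 if v == i else 0 for v in row] for row in instance_map]
        for i in ids
    }


# A toy 3x4 "image" containing two separate objects (IDs 1 and 2).
instance_map = [
    [0, 1, 1, 0],
    [0, 1, 0, 2],
    [0, 0, 2, 2],
]
masks = split_instances(instance_map)
```

Two objects of the same class would still receive different IDs here, which is what distinguishes instance segmentation from semantic segmentation.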

Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21

Most implemented papers

Mask R-CNN

tensorflow/models ICCV 2017

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

MMDetection: Open MMLab Detection Toolbox and Benchmark

open-mmlab/mmdetection 17 Jun 2019

In this paper, we introduce the various features of this toolbox.

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

microsoft/Swin-Transformer ICCV 2021

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

YOLACT: Real-time Instance Segmentation

dbolya/yolact ICCV 2019

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.
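The linear combination mentioned above can be sketched in plain Python as follows. This is an illustrative toy, not the repository's implementation: the function name `assemble_masks`, the hand-made prototypes, and the 0.5 threshold are assumptions, and the cropping of each mask to its predicted box is omitted.

```python
import math


def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine k prototype maps into one binary mask per instance.

    prototypes:   k maps, each an h x w list of floats.
    coefficients: n lists of k floats, one list per detected instance.
    Each mask pixel is sigmoid(sum_k coeff_k * prototype_k) > threshold.
    """
    h, w = len(prototypes[0]), len(prototypes[0][0])
    masks = []
    for coeffs in coefficients:
        mask = []
        for y in range(h):
            row = []
            for x in range(w):
                # Weighted sum of all prototypes at this pixel.
                s = sum(c * p[y][x] for c, p in zip(coeffs, prototypes))
                prob = 1.0 / (1.0 + math.exp(-s))
                row.append(1 if prob > threshold else 0)
            mask.append(row)
        masks.append(mask)
    return masks


# Two hand-made 2x4 prototypes (hypothetical values).
prototypes = [
    [[1.0, 1.0, -1.0, -1.0],
     [1.0, 1.0, -1.0, -1.0]],    # responds to the left half
    [[1.0, 1.0, 1.0, 1.0],
     [-1.0, -1.0, -1.0, -1.0]],  # responds to the top row
]
coefficients = [
    [2.0, 0.0],   # instance 0 uses only the first prototype
    [1.0, -2.0],  # instance 1 subtracts the second prototype
]
masks = assemble_masks(prototypes, coefficients)
```

The second instance shows why signed coefficients matter: a negative weight suppresses the region where a prototype fires, so a small set of shared prototypes can be recombined into different masks.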

Deep High-Resolution Representation Learning for Visual Recognition

open-mmlab/mmdetection 20 Aug 2019

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection.

Deep High-Resolution Representation Learning for Human Pose Estimation

leoxiaobin/deep-high-resolution-net.pytorch CVPR 2019

We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel.

YOLACT++: Better Real-time Instance Segmentation

dbolya/yolact 3 Dec 2019

Then we produce instance masks by linearly combining the prototypes with the mask coefficients.

ResNeSt: Split-Attention Networks

zhanghang1989/ResNeSt 19 Apr 2020

It is well known that featuremap attention and multi-path representation are important for visual recognition.

Microsoft COCO: Common Objects in Context

PaddlePaddle/PaddleDetection 1 May 2014

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.

Non-local Neural Networks

facebookresearch/video-nonlocal-net CVPR 2018

Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time.