Instance Segmentation
981 papers with code • 25 benchmarks • 84 datasets
Instance Segmentation is a computer vision task that involves identifying and delineating individual objects within an image: detecting each object's boundaries and assigning it a unique instance label. The goal is to produce a pixel-wise segmentation map of the image, where each pixel is assigned to a specific object instance.
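The output described above can be illustrated with a toy example: given per-instance binary masks (the typical model output), they combine into a single pixel-wise instance map. This is a minimal sketch with hypothetical 4x4 masks, not the output format of any specific library.

```python
import numpy as np

# Two hypothetical object instances in a 4x4 image, each given as a
# binary mask, as a typical instance segmentation model would produce.
mask_a = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]], dtype=bool)
mask_b = np.array([[0, 0, 0, 0],
                   [0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [0, 0, 0, 0]], dtype=bool)

# Combine per-instance masks into one pixel-wise instance map:
# 0 = background, 1 = first instance, 2 = second instance.
instance_map = np.zeros((4, 4), dtype=np.int32)
for instance_id, mask in enumerate([mask_a, mask_b], start=1):
    instance_map[mask] = instance_id

print(instance_map)
```

Note that, unlike semantic segmentation, two objects of the same class would still receive distinct instance ids in this map.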
Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21
Libraries
Use these libraries to find Instance Segmentation models and implementations.
Subtasks
- Referring Expression Segmentation
- 3D Instance Segmentation
- Real-time Instance Segmentation
- Unsupervised Object Segmentation
- Amodal Instance Segmentation
- Box-supervised Instance Segmentation
- Unseen Object Instance Segmentation
- Image-level Supervised Instance Segmentation
- 3D Semantic Instance Segmentation
- Open-World Instance Segmentation
- Human Instance Segmentation
- One-Shot Instance Segmentation
- Semi-Supervised Person Instance Segmentation
- Point-Supervised Instance Segmentation
- Solar Cell Segmentation
Most implemented papers
Rethinking Channel Dimensions for Efficient Model Design
We then investigate the channel configuration of a model by searching network architectures concerning the channel configuration under the computational cost restriction.
Panoptic Segmentation
We propose and study a task we name panoptic segmentation (PS).
GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond
In this paper, we take advantage of this finding to create a simplified network based on a query-independent formulation, which maintains the accuracy of NLNet but with significantly less computation.
Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation
In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast system for panoptic segmentation, aiming to establish a solid baseline for bottom-up methods that achieves performance comparable to two-stage methods while yielding fast inference speed.
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation
The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer, thus being fast in inference.
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Unlike the recently-proposed Transformer model (e.g., ViT) that is specially designed for image classification, we propose Pyramid Vision Transformer (PVT), which overcomes the difficulties of porting Transformer to various dense prediction tasks.
Co-Scale Conv-Attentional Image Transformers
In this paper, we present Co-scale conv-attentional image Transformers (CoaT), a Transformer-based image classifier equipped with co-scale and conv-attentional mechanisms.
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation
In this paper, we present Mask DINO, a unified object detection and segmentation framework.
RTMDet: An Empirical Study of Designing Real-Time Object Detectors
In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection.
Semantic Instance Segmentation with a Discriminative Loss Function
In this work we propose to tackle the problem with a discriminative loss function, operating at the pixel level, that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step.
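The pull/push structure of such a discriminative loss can be sketched in a few lines: a variance term pulls each pixel embedding toward its instance mean, a distance term pushes instance means apart, and a small regularizer keeps means near the origin. This is a minimal numpy sketch of that structure, not the paper's reference implementation; the margins `delta_v` and `delta_d` and the regularizer weight are illustrative values.

```python
import numpy as np

def discriminative_loss(embeddings, labels, delta_v=0.5, delta_d=1.5):
    """Pull/push loss over per-pixel embeddings (N, D) with per-pixel
    instance labels (N,). Every label is treated as an instance here;
    a real implementation would usually exclude background."""
    instance_ids = np.unique(labels)
    means = np.stack([embeddings[labels == i].mean(axis=0)
                      for i in instance_ids])

    # Variance (pull) term: draw embeddings toward their instance mean,
    # penalizing only distances beyond the margin delta_v.
    l_var = 0.0
    for mu, i in zip(means, instance_ids):
        dists = np.linalg.norm(embeddings[labels == i] - mu, axis=1)
        l_var += np.mean(np.maximum(dists - delta_v, 0.0) ** 2)
    l_var /= len(instance_ids)

    # Distance (push) term: push instance means at least 2*delta_d apart.
    l_dist = 0.0
    C = len(instance_ids)
    if C > 1:
        for a in range(C):
            for b in range(C):
                if a != b:
                    d = np.linalg.norm(means[a] - means[b])
                    l_dist += np.maximum(2 * delta_d - d, 0.0) ** 2
        l_dist /= C * (C - 1)

    # Regularization: keep instance means close to the origin.
    l_reg = np.mean(np.linalg.norm(means, axis=1))

    return l_var + l_dist + 0.001 * l_reg
```

After training with such a loss, instances are recovered by clustering the embeddings, e.g. with mean-shift, as the simple post-processing step the summary mentions.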