Instance Segmentation
960 papers with code • 25 benchmarks • 82 datasets
Instance Segmentation is a computer vision task that involves identifying and separating individual objects within an image, including detecting the boundaries of each object and assigning a unique label to each object. The goal of instance segmentation is to produce a pixel-wise segmentation map of the image, where each pixel is assigned to a specific object instance.
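To make the pixel-wise map described above concrete, here is a minimal sketch in Python with NumPy. It assumes one common convention (not specified on this page): the map is an integer array in which 0 is background and each positive integer identifies one object instance. The helper names `split_instances` and `mask_to_box` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Toy 6x6 instance map. Assumed convention: 0 = background,
# each positive integer = one object instance.
instance_map = np.zeros((6, 6), dtype=np.int32)
instance_map[0:3, 0:3] = 1  # first object
instance_map[3:6, 2:6] = 2  # second object (may share a class with the first)


def split_instances(instance_map):
    """Return {instance_id: boolean mask} for every non-background instance."""
    ids = np.unique(instance_map)
    return {int(i): instance_map == i for i in ids if i != 0}


def mask_to_box(mask):
    """Tight bounding box (row_min, col_min, row_max, col_max), inclusive."""
    rows, cols = np.nonzero(mask)
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())


masks = split_instances(instance_map)
boxes = {i: mask_to_box(m) for i, m in masks.items()}
areas = {i: int(m.sum()) for i, m in masks.items()}
```

Because every pixel carries exactly one instance ID, the per-instance masks never overlap. Models that predict overlapping masks typically store one binary mask per instance instead of a single integer map.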
Image Credit: Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers, CVPR'21
Subtasks
- Referring Expression Segmentation
- 3D Instance Segmentation
- Real-time Instance Segmentation
- Unsupervised Object Segmentation
- Amodal Instance Segmentation
- Box-supervised Instance Segmentation
- Image-level Supervised Instance Segmentation
- Unseen Object Instance Segmentation
- 3D Semantic Instance Segmentation
- Open-World Instance Segmentation
- Human Instance Segmentation
- One-Shot Instance Segmentation
- Semi-Supervised Person Instance Segmentation
- Point-Supervised Instance Segmentation
- Solar Cell Segmentation
Latest papers
NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer
In the last few years, a handful of machine learning approaches for osteoclast image analysis have been developed, but none have addressed the full instance segmentation task required to produce the same output as that of the human-expert-led process.
ViM-UNet: Vision Mamba for Biomedical Segmentation
Here, we introduce ViM-UNet, a novel segmentation architecture based on Vision Mamba, and compare it to UNet and UNETR for two challenging microscopy instance segmentation tasks.
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning
Panoptic segmentation, combining semantic and instance segmentation, stands as a cutting-edge computer vision task.
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals their underrated effectiveness over predominant ResNet-style architectures.
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information.
Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer
Transformers used in vision have been investigated through diverse architectures - ViT, PVT, and Swin.
BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation
To generate higher quality pseudo-labels and achieve more precise weakly supervised 3DIS results, we propose the Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation (BSNet), which devises a novel pseudo-labeler called Simulation-assisted Transformer.
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining
However, transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks.
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Given a set of initial queries, class-agnostic mask generation employs a transformer decoder to predict query masks and corresponding object scores and mask IoU scores.
Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery
We then introduce a novel cross-view instance label grouping strategy based on the 3D scene representation to mitigate the multi-view inconsistency problem in the 2D instance labels.