Panoptic Segmentation
214 papers with code • 24 benchmarks • 32 datasets
Panoptic Segmentation is a computer vision task that combines semantic segmentation and instance segmentation to provide a comprehensive understanding of a scene. The goal is to partition the image into semantically meaningful regions while also detecting and distinguishing individual object instances within those regions. Every pixel is assigned a semantic label, and pixels belonging to "things" classes (countable objects with instances, such as cars and people) additionally receive unique instance IDs. (Image credit: Detectron2)
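The "semantic label plus instance ID" convention described above can be illustrated with a small sketch. The `LABEL_DIVISOR` packing scheme below (`semantic_id * divisor + instance_id`) is one common encoding used in panoptic pipelines; the class ids and helper names here are made up for illustration:

```python
import numpy as np

# Convention (assumed here): each pixel stores semantic_id * LABEL_DIVISOR + instance_id.
# "Stuff" classes (e.g. road, sky) use instance_id 0; "things" get unique instance ids.
LABEL_DIVISOR = 1000

def encode_panoptic(semantic, instance):
    """Combine per-pixel semantic ids and instance ids into one panoptic map."""
    return semantic.astype(np.int64) * LABEL_DIVISOR + instance

def decode_panoptic(panoptic):
    """Recover (semantic_id, instance_id) from a panoptic map."""
    return panoptic // LABEL_DIVISOR, panoptic % LABEL_DIVISOR

semantic = np.array([[7, 7], [21, 21]])   # hypothetical ids: 7 = road (stuff), 21 = car (thing)
instance = np.array([[0, 0], [1, 2]])     # two distinct car instances
pan = encode_panoptic(semantic, instance)
sem, inst = decode_panoptic(pan)
assert (sem == semantic).all() and (inst == instance).all()
```

The round trip shows why "stuff" pixels share one panoptic id per class while each "thing" instance gets its own.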
Libraries

Use these libraries to find Panoptic Segmentation models and implementations.

Most implemented papers
Dilated Neighborhood Attention Transformer
These models typically employ localized attention mechanisms, such as the sliding-window Neighborhood Attention (NA) or Swin Transformer's Shifted Window Self Attention.
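As a rough illustration of how such localized attention differs from full self-attention, here is a naive 1D sliding-window sketch (no learned projections or multi-head logic, and the border handling is simplified relative to the actual Neighborhood Attention kernel, which keeps the window size constant near edges):

```python
import numpy as np

def neighborhood_attention_1d(x, window=3):
    """Naive sliding-window attention: each token attends only to its
    `window`-sized neighborhood (clamped at the borders), not the full sequence.
    Identity Q/K/V projections are used purely for illustration."""
    n, d = x.shape
    q, k, v = x, x, x
    half = window // 2
    out = np.zeros_like(x)
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        scores = q[i] @ k[lo:hi].T / np.sqrt(d)   # scaled dot-product scores
        w = np.exp(scores - scores.max())
        w /= w.sum()                              # softmax over the neighborhood
        out[i] = w @ v[lo:hi]
    return out
```

The cost per token is O(window · d) rather than O(n · d), which is the point of these localized mechanisms.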
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning
Datasets drive vision progress, yet existing driving datasets are limited in visual content and in the tasks they support for studying multitask learning in autonomous driving.
Cityscapes-Panoptic-Parts and PASCAL-Panoptic-Parts datasets for Scene Understanding
In this technical report, we present two novel datasets for image scene understanding.
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Vision transformers have recently achieved competitive results across various vision tasks but still suffer from heavy computation costs when processing a large number of tokens.
FlexiViT: One Model for All Patch Sizes
Vision Transformers convert images to sequences by slicing them into patches.
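That slicing step can be sketched in a few lines of NumPy; `patchify` below is a hypothetical helper, not FlexiViT's actual code. Note how the patch size directly controls the sequence length, which is the degree of freedom FlexiViT exploits:

```python
import numpy as np

def patchify(image, patch):
    """Slice an (H, W, C) image into a sequence of flattened patch tokens."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)            # (grid_h, grid_w, patch, patch, c)
    return x.reshape(-1, patch * patch * c)   # (num_tokens, token_dim)

img = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
tokens = patchify(img, 2)
assert tokens.shape == (4, 12)   # 2x2 grid of 2x2x3 patches
assert patchify(img, 4).shape == (1, 48)  # larger patch -> shorter sequence
```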
MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers
As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time.
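For reference, the PQ (Panoptic Quality) metric cited here averages the IoU of matched predicted/ground-truth segment pairs and penalizes unmatched predictions and ground truths; a minimal per-class sketch (the matching itself, at IoU > 0.5, is assumed done upstream):

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """PQ for one class: matched pairs (IoU > 0.5) are true positives.
    PQ = (sum of matched IoUs) / (TP + 0.5 * FP + 0.5 * FN)."""
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    return sum(matched_ious) / denom if denom else 0.0

# e.g. two matched segments with IoUs 0.8 and 0.6, one false positive, no false negatives
pq = panoptic_quality([0.8, 0.6], num_fp=1, num_fn=0)  # 1.4 / 2.5 = 0.56
```

The dataset-level PQ averages this quantity over classes.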
Exemplar-Based Open-Set Panoptic Segmentation Network
We extend panoptic segmentation to the open-world and introduce an open-set panoptic segmentation (OPS) task.
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results.
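The mask-classification formulation can be sketched at inference time: the model predicts N query masks, each paired with a class distribution, and per-pixel semantic scores come from combining the two. This mirrors MaskFormer-style semantic inference; the shapes and names below are illustrative:

```python
import numpy as np

def mask_classification_inference(class_logits, mask_logits):
    """Combine N query predictions into a per-pixel semantic map.
    class_logits: (N, num_classes), mask_logits: (N, H, W)."""
    def softmax(x, axis):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)
    cls = softmax(class_logits, axis=-1)            # (N, num_classes)
    masks = 1.0 / (1.0 + np.exp(-mask_logits))      # (N, H, W), per-query sigmoid masks
    scores = np.einsum('nc,nhw->chw', cls, masks)   # mask-weighted class probabilities
    return scores.argmax(0)                         # (H, W) per-pixel class id
```

Contrast with per-pixel classification, where every pixel independently receives a class distribution with no notion of a segment.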
Finite Scalar Quantization: VQ-VAE Made Simple
Each dimension is quantized to a small set of fixed values, leading to an (implicit) codebook given by the product of these sets.
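A minimal sketch of that quantizer, assuming tanh bounding and `levels[d]` evenly spaced values per dimension (the straight-through gradient used during training is omitted); the implicit codebook size is the product of the levels, e.g. 3 * 5 * 5 = 75 here:

```python
import numpy as np

def fsq(z, levels):
    """Finite scalar quantization sketch: bound each dimension to (-1, 1)
    with tanh, then round it to one of levels[d] evenly spaced values."""
    z = np.tanh(np.asarray(z, dtype=float))      # squash to (-1, 1)
    half = (np.asarray(levels, dtype=float) - 1) / 2
    return np.round(z * half) / half             # quantized, still in [-1, 1]

zq = fsq([0.2, -1.3, 0.9], levels=[3, 5, 5])
# each entry now lies on a fixed per-dimension grid, e.g. {-1, -0.5, 0, 0.5, 1} for 5 levels
```

No learned codebook or commitment loss is needed, which is the "made simple" part relative to VQ-VAE.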
Panoptic Video Scene Graph Generation
PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.