Video Panoptic Segmentation
17 papers with code • 3 benchmarks • 4 datasets
Video Panoptic Segmentation is a computer vision task that extends panoptic segmentation to the temporal dimension. Given a video sequence, the goal is to predict the semantic class of each pixel while consistently tracking object instances: pixels belonging to the same object instance should be assigned the same instance ID throughout the video sequence.
Most implemented papers
TarViS: A Unified Approach for Target-based Video Segmentation
A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining.
DVIS: Decoupled Video Instance Segmentation Framework
The efficacy of the decoupling strategy relies on two crucial elements: 1) attaining precise long-term alignment outcomes via frame-by-frame association during tracking, and 2) the effective utilization of temporal information predicated on the aforementioned accurate alignment outcomes during refinement.
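As a rough illustration of what frame-by-frame association means, the sketch below links per-frame masks to existing tracks by greedy mask-IoU matching. This is not the method from the paper; it is a simple hand-written stand-in, and the function names, the set-of-pixels mask representation, and the threshold are all assumptions.

```python
def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def associate(prev_tracks, curr_masks, next_id, iou_threshold=0.5):
    """Greedily match current-frame masks to previous-frame tracks.

    prev_tracks: dict track_id -> mask from the previous frame.
    curr_masks:  list of masks predicted for the current frame.
    Returns (dict track_id -> current mask, next unused track id).
    """
    # Score every (track, mask) pair and visit them best-first.
    pairs = sorted(
        (
            (mask_iou(prev_mask, curr_mask), track_id, j)
            for track_id, prev_mask in prev_tracks.items()
            for j, curr_mask in enumerate(curr_masks)
        ),
        reverse=True,
    )
    assigned, used_tracks, used_masks = {}, set(), set()
    for iou, track_id, j in pairs:
        if iou < iou_threshold:
            break  # remaining pairs overlap too little to match
        if track_id in used_tracks or j in used_masks:
            continue
        assigned[track_id] = curr_masks[j]
        used_tracks.add(track_id)
        used_masks.add(j)
    # Unmatched masks start new tracks with fresh ids.
    for j, curr_mask in enumerate(curr_masks):
        if j not in used_masks:
            assigned[next_id] = curr_mask
            next_id += 1
    return assigned, next_id


prev = {1: {(0, 0), (0, 1)}, 2: {(1, 2)}}
curr = [{(1, 1), (1, 2)}, {(0, 1), (0, 2)}, {(2, 2)}]
assigned, next_id = associate(prev, curr, next_id=3, iou_threshold=0.3)
```

Here tracks 1 and 2 are carried over to their best-overlapping masks, and the third mask, which overlaps nothing, opens a new track. Errors made by such a step accumulate over long videos, which is why the quoted passage stresses precise long-term alignment before any temporal refinement.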
1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation
In this report, we successfully validated the effectiveness of the decoupling strategy in video panoptic segmentation.
Tracking Anything with Decoupled Video Segmentation
To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation.
MaXTron: Mask Transformer with Trajectory Attention for Video Panoptic Segmentation
To alleviate the issue, we propose to adapt the trajectory attention for both the dense pixel features and object queries, aiming to improve the short-term and long-term tracking results, respectively.
DVIS++: Improved Decoupled Framework for Universal Video Segmentation
We present the Decoupled VIdeo Segmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation (VSS), and video panoptic segmentation (VPS).
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Despite the recent advances in unified image segmentation (IS), developing a unified video segmentation (VS) model remains a challenge.