DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries

29 Mar 2024 · Yikang Zhou, Tao Zhang, Shunping Ji, Shuicheng Yan, Xiangtai Li

Modern video segmentation methods adopt object queries to perform inter-frame association and demonstrate satisfactory performance in tracking continuously appearing objects despite large-scale motion and transient occlusion. However, they all underperform on newly emerging and disappearing objects that are common in the real world because they attempt to model object emergence and disappearance through feature transitions between background and foreground queries that have significant feature gaps. We introduce Dynamic Anchor Queries (DAQ) to shorten the transition gap between the anchor and target queries by dynamically generating anchor queries based on the features of potential candidates. Furthermore, we introduce a query-level object Emergence and Disappearance Simulation (EDS) strategy, which unleashes DAQ's potential without any additional cost. Finally, we combine our proposed DAQ and EDS with DVIS to obtain DVIS-DAQ. Extensive experiments demonstrate that DVIS-DAQ achieves a new state-of-the-art (SOTA) performance on five mainstream video segmentation benchmarks. Code and models are available at https://github.com/SkyworkAI/DAQ-VS.
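
The "transition gap" idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names (`l2_gap`, `dynamic_anchor`), the feature dimension, the Gaussian noise, and the use of Euclidean distance as the "gap" are all assumptions made for illustration. It only shows that an anchor derived from a candidate's own features starts closer to the target query than a fixed background anchor does.

```python
import math
import random


def l2_gap(a, b):
    """Euclidean distance between two query feature vectors (assumed gap measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dynamic_anchor(candidate_feature, noise_scale=0.1, rng=None):
    """Hypothetical anchor generator: start from the candidate's own features.

    The small perturbation stands in for whatever learned transformation a
    real model would apply; it is an assumption, not the paper's method.
    """
    rng = rng or random.Random(0)
    return [x + rng.gauss(0.0, noise_scale) for x in candidate_feature]


rng = random.Random(42)
dim = 8

# Toy features: the background anchor sits near the origin, while the
# emerging object's candidate feature and its target query sit far from it.
static_background_anchor = [rng.gauss(0.0, 0.1) for _ in range(dim)]
candidate_feature = [3.0 + rng.gauss(0.0, 0.1) for _ in range(dim)]
target_query = [3.0 + rng.gauss(0.0, 0.1) for _ in range(dim)]

static_gap = l2_gap(static_background_anchor, target_query)
dynamic_gap = l2_gap(dynamic_anchor(candidate_feature, rng=rng), target_query)

print(f"static anchor gap:  {static_gap:.2f}")
print(f"dynamic anchor gap: {dynamic_gap:.2f}")
```

The toy data is constructed so that the candidate feature resembles the target, which is the premise the abstract relies on; how DVIS-DAQ actually selects candidates and generates anchors is described in the paper and the linked repository, not here.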


Results from the Paper


 Ranked #1 on Video Instance Segmentation on OVIS validation (using extra training data)

Task                         Dataset                 Model                     Metric   Value  Global Rank
Video Instance Segmentation  OVIS validation         DVIS-DAQ(VIT-L, Offline)  mask AP  57.1   #1
                                                                               AP50     83.8   #1
                                                                               AP75     62.9   #2
Video Instance Segmentation  YouTube-VIS 2021        DVIS-DAQ(VIT-L, Offline)  mask AP  64.5   #2
                                                                               AP50     86.1   #3
                                                                               AP75     72.2   #2
                                                                               AR10     70.7   #1
                                                                               AR1      49.6   #2
Video Instance Segmentation  YouTube-VIS validation  DVIS-DAQ(VIT-L, Offline)  mask AP  69.2   #2
                                                                               AP50     90.8   #2
                                                                               AP75     76.8   #2
                                                                               AR1      58.4   #1
                                                                               AR10     75.5   #1
