Mask Frozen-DETR: High Quality Instance Segmentation with One GPU

7 Aug 2023 · Zhanhao Liang, Yuhui Yuan

In this paper, we study how to build a strong instance segmenter with minimal training time and GPUs, in contrast to the majority of current approaches, which pursue more accurate instance segmenters by building more advanced frameworks at the cost of longer training time and higher GPU requirements. To achieve this, we introduce a simple and general framework, termed Mask Frozen-DETR, which can convert any existing DETR-based object detection model into a powerful instance segmentation model. Our method only requires training an additional lightweight mask network that predicts instance masks within the bounding boxes given by a frozen DETR-based object detector. Remarkably, our method outperforms the state-of-the-art instance segmentation method Mask DINO on the COCO test-dev split (55.3% vs. 54.7% mask AP) while being over 10× faster to train. Furthermore, all of our experiments can be run using only one Tesla V100 GPU with 16 GB of memory, demonstrating the significant efficiency of our proposed framework.
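Because the mask network predicts masks within the detector's bounding boxes, each box-local mask has to be placed back into image coordinates at some point. The following is a minimal sketch of that placement step in plain Python, not the authors' implementation: the function name `paste_box_mask`, the integer box convention `(x0, y0, x1, y1)` with exclusive upper bounds, and the nested-list mask representation are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed convention: integer pixel coordinates, x1 and y1 exclusive.
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1)


def paste_box_mask(box_mask: List[List[int]], box: Box,
                   height: int, width: int) -> List[List[int]]:
    """Place a mask predicted inside `box` onto an empty height x width canvas."""
    x0, y0, x1, y1 = box
    if len(box_mask) != y1 - y0 or any(len(row) != x1 - x0 for row in box_mask):
        raise ValueError("box_mask shape must match the box size")
    canvas = [[0] * width for _ in range(height)]
    for dy, row in enumerate(box_mask):
        y = y0 + dy
        if not 0 <= y < height:
            continue  # clip rows that fall outside the image
        for dx, value in enumerate(row):
            x = x0 + dx
            if 0 <= x < width:  # clip columns that fall outside the image
                canvas[y][x] = value
    return canvas


# Hypothetical example: a 2x3 box mask for the box x in [1, 4), y in [1, 3)
# pasted into a 4x5 image.
full = paste_box_mask([[1, 1, 0], [0, 1, 1]], (1, 1, 4, 3), height=4, width=5)
```

In this sketch the detector's box is treated as fixed input, mirroring the frozen-detector setting: only the content of `box_mask` would come from the trainable mask network.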

Results from the Paper


Ranked #3 on Instance Segmentation on COCO minival (using extra training data)

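Task                   Dataset        Model             Metric Name  Metric Value  Global Rank
Instance Segmentation  COCO minival   Mask Frozen-DETR  mask AP      54.9          #3
Instance Segmentation  COCO minival   Mask Frozen-DETR  AP50         78.9          #3
Instance Segmentation  COCO minival   Mask Frozen-DETR  AP75         60.8          #3
Instance Segmentation  COCO minival   Mask Frozen-DETR  APL          72.9          #2
Instance Segmentation  COCO minival   Mask Frozen-DETR  APM          58.4          #1
Instance Segmentation  COCO minival   Mask Frozen-DETR  APS          37.2          #3
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  mask AP      55.3          #3
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  AP50         79.3          #3
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  AP75         61.4          #2
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  APS          37.8          #2
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  APM          58.4          #2
Instance Segmentation  COCO test-dev  Mask Frozen-DETR  APL          70.4          #3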

Methods