DETReg: Unsupervised Pretraining with Region Priors for Object Detection

Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of the detection architecture. Instead, we introduce DETReg, a new self-supervised method that pretrains the entire object detection network, including the object localization and embedding components. During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator and simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder. We implement DETReg using the DETR family of detectors and show that it improves over competitive baselines when finetuned on the COCO, PASCAL VOC, and Airbus Ship benchmarks. In low-data regimes, including semi-supervised and few-shot learning settings, DETReg establishes many state-of-the-art results, e.g., on COCO we see a +6.0 AP improvement for 10-shot detection and improvements of over 2 AP when training with only 1% of the labels. For code and pretrained models, visit the project page at
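
The pretraining objective described above (match predicted boxes to unsupervised region proposals, and align the matched embeddings with those of a self-supervised encoder) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the (cx, cy, w, h) box format, the plain L1 distances, the loss weights, and the brute-force one-to-one matching are all assumptions chosen to keep the example self-contained.

```python
from itertools import permutations


def l1(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pretraining_loss(pred_boxes, pred_embs, prop_boxes, prop_embs,
                     box_weight=1.0, emb_weight=1.0):
    """Toy DETReg-style objective (hypothetical helper, not the paper's code).

    Finds the one-to-one assignment of predictions to proposals that
    minimizes the summed box L1 cost plus embedding L1 cost.
    Brute force over assignments, so only usable for a handful of boxes.
    Assumes len(pred_boxes) >= len(prop_boxes).
    """
    best_cost, best_match = float("inf"), None
    for perm in permutations(range(len(pred_boxes)), len(prop_boxes)):
        # perm[j] is the index of the prediction assigned to proposal j.
        cost = sum(
            box_weight * l1(pred_boxes[i], prop_boxes[j])
            + emb_weight * l1(pred_embs[i], prop_embs[j])
            for j, i in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_match = cost, perm
    return best_cost, best_match


# Three predicted queries, two pseudo-ground-truth proposals (made-up values).
pred_boxes = [(0.5, 0.5, 0.25, 0.2), (0.1, 0.1, 0.1, 0.1), (0.8, 0.8, 0.3, 0.3)]
pred_embs = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
prop_boxes = [(0.1, 0.1, 0.1, 0.1), (0.5, 0.5, 0.2, 0.2)]  # from a region proposal method
prop_embs = [(0.0, 1.0), (1.0, 0.0)]                        # from a frozen image encoder

loss, match = pretraining_loss(pred_boxes, pred_embs, prop_boxes, prop_embs)
print(loss, match)
```

Here proposal 0 is matched to prediction 1 and proposal 1 to prediction 0; the third prediction is left unmatched. A practical detector would handle unmatched queries separately and would use an efficient assignment solver rather than enumerating permutations.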

CVPR 2022
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Object Detection | COCO 10% labeled data | DETReg | mAP | 29.12 ± 0.2 | #22 |
| Semi-Supervised Object Detection | COCO 1% labeled data | DETReg | mAP | 14.58 ± 0.3 | #18 |
| Few-Shot Object Detection | COCO 2017 | DETReg (ours) | AP | 30 | #1 |
| Semi-Supervised Object Detection | COCO 2% labeled data | DETReg | mAP | 18.69 ± 0.2 | #14 |
| Semi-Supervised Object Detection | COCO 5% labeled data | DETReg | mAP | 24.80 ± 0.2 | #19 |
| Few-Shot Object Detection | MS-COCO (10-shot) | DETReg-ft-full DDETR | AP | 25 | #1 |
| Few-Shot Object Detection | MS-COCO (30-shot) | DETReg-ft-full DDETR | AP | 30 | #2 |
| Object Detection | PASCAL VOC 10% | DETReg (ours) | AP | 51.4 | #2 |
| | | | AP50 | 72.2 | #2 |
| | | | AP75 | 56.6 | #2 |
| Object Detection | PASCAL VOC 2007 | DETReg (ours) | AP50 | 83.3 | #2 |
| | | | AP | 63.5 | #2 |
| | | | AP75 | 70.3 | #2 |
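
By the usual convention, AP50 and AP75 are average precision computed with intersection-over-union (IoU) thresholds of 0.5 and 0.75 respectively. The sketch below shows only the IoU check that decides whether a predicted box overlaps a ground-truth box enough at a given threshold; the (x1, y1, x2, y2) corner format and the function names are assumptions for illustration, and a full AP computation additionally requires class agreement, one-to-one matching, and a precision-recall sweep.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region, clamped at zero when disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def overlaps_enough(pred, gt, threshold):
    """True if the predicted box meets the IoU threshold against the ground truth."""
    return iou(pred, gt) >= threshold


pred = (0.0, 0.0, 10.0, 10.0)
gt = (0.0, 0.0, 10.0, 6.0)
print(iou(pred, gt))                     # 60 / 100 = 0.6
print(overlaps_enough(pred, gt, 0.50))   # counts toward AP50
print(overlaps_enough(pred, gt, 0.75))   # does not count toward AP75
```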