CLIP-Guided Source-Free Object Detection in Aerial Images

10 Jan 2024  ·  Nanqing Liu, Xun Xu, Yongyi Su, Chengxin Liu, Peiliang Gong, Heng-Chao Li ·

Domain adaptation is crucial in aerial imagery, as the visual representation of these images can significantly vary based on factors such as geographic location, time, and weather conditions. Additionally, high-resolution aerial images often require substantial storage space and may not be readily accessible to the public. To address these challenges, we propose a novel Source-Free Object Detection (SFOD) method. Specifically, our approach begins with a self-training framework, which significantly enhances the performance of baseline methods. To alleviate the noisy labels in self-training, we utilize Contrastive Language-Image Pre-training (CLIP) to guide the generation of pseudo-labels, termed CLIP-guided Aggregation (CGA). By leveraging CLIP's zero-shot classification capability, we aggregate its scores with the original predicted bounding boxes, enabling us to obtain refined scores for the pseudo-labels. To validate the effectiveness of our method, we constructed two new datasets from different domains based on the DIOR dataset, named DIOR-C and DIOR-Cloudy. Experimental results demonstrate that our method outperforms other comparative algorithms. The code is available at https://github.com/Lans1ng/SFOD-RS.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here