Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

11 Dec 2023 · Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao

Prompt learning has become a prevalent strategy for adapting vision-language foundation models to downstream tasks. With the emergence of large language models (LLMs), recent studies have explored using category-related descriptions as input to improve prompt effectiveness. However, conventional descriptions lack structured information that captures the interconnections among the entities and attributes linked to a particular category. To address this limitation, this paper advocates leveraging LLMs to build a graph for each description that models the entities and attributes describing the category, as well as their correlations. Existing prompt tuning methods are ill-suited to handling this structured knowledge. We therefore propose Hierarchical Prompt Tuning (HPT), which models structured and conventional linguistic knowledge simultaneously. Specifically, we introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning. In addition, by incorporating high-level and global-level prompts that model overall semantics, the proposed hierarchical structure forges cross-level interlinks and enables the model to handle more complex and long-term relationships. Extensive experiments demonstrate that HPT is effective and generalizes much better than existing SOTA methods. Our code is available at https://github.com/Vill-Lab/2024-AAAI-HPT.
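
To make the idea of a per-description graph and relationship-guided attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the example graph, the function names (`relation_bias`, `biased_attention`), and the choice of an additive bias on attention scores are all assumptions made for illustration. The abstract only states that entities, attributes, and their correlations are modeled as a graph and that attention captures pair-wise associations among them.

```python
import math

# Hypothetical structured description for one category ("water lily").
# In the paper an LLM produces such graphs; this one is written by hand.
graph = {
    "entities": ["water lily", "leaves", "pond"],
    "attributes": ["round", "floating"],
    # Pairs of nodes that the description links together (assumed format).
    "relations": [
        ("water lily", "leaves"),
        ("leaves", "round"),
        ("leaves", "floating"),
        ("water lily", "pond"),
    ],
}


def relation_bias(graph, bonus=1.0):
    """Return the node list and a symmetric bias matrix.

    Linked pairs get `bonus`; all other pairs get 0.
    """
    nodes = graph["entities"] + graph["attributes"]
    index = {name: i for i, name in enumerate(nodes)}
    bias = [[0.0] * len(nodes) for _ in nodes]
    for src, dst in graph["relations"]:
        i, j = index[src], index[dst]
        bias[i][j] = bias[j][i] = bonus
    return nodes, bias


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention(scores, bias):
    """Add the relation bias to raw pairwise scores, then normalize each row."""
    return [
        softmax([s + b for s, b in zip(score_row, bias_row)])
        for score_row, bias_row in zip(scores, bias)
    ]


nodes, bias = relation_bias(graph)
# Uniform raw scores stand in for query-key dot products of token embeddings.
scores = [[0.0] * len(nodes) for _ in nodes]
weights = biased_attention(scores, bias)
```

With uniform raw scores, each node attends more strongly to the nodes it is linked to in the graph, which is the intuition behind letting relationships guide attention. A real module would operate on learned embeddings and would likely learn the bias rather than fix it.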

Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Prompt Engineering | Caltech-101 | HPT | Harmonic mean | 96.65 | # 2
Prompt Engineering | DTD | HPT | Harmonic mean | 72.16 | # 4
Prompt Engineering | EuroSAT | HPT | Harmonic mean | 84.82 | # 4
Prompt Engineering | FGVC-Aircraft | HPT | Harmonic mean | 40.28 | # 2
Prompt Engineering | Food-101 | HPT | Harmonic mean | 91.01 | # 7
Prompt Engineering | ImageNet | HPT | Harmonic mean | 74.17 | # 4
Prompt Engineering | ImageNet-A | HPT | Top-1 accuracy % | 50.85 | # 4
Prompt Engineering | ImageNet-R | HPT | Top-1 accuracy % | 77.38 | # 4
Prompt Engineering | ImageNet-S | HPT | Top-1 accuracy % | 49.36 | # 4
Prompt Engineering | ImageNet V2 | HPT | Top-1 accuracy % | 65.25 | # 1
Prompt Engineering | Oxford 102 Flower | HPT | Harmonic mean | 87.16 | # 2
Prompt Engineering | Oxford-IIIT Pet Dataset | HPT | Harmonic mean | 96.71 | # 3
Prompt Engineering | Stanford Cars | HPT | Harmonic mean | 75.57 | # 5
Prompt Engineering | SUN397 | HPT | Harmonic mean | 80.88 | # 3
Prompt Engineering | UCF101 | HPT | Harmonic mean | 83.16 | # 3
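
The "Harmonic mean" metric is not defined on this page. In prompt-learning benchmarks it usually denotes the harmonic mean of accuracy on base (seen) classes and novel (unseen) classes, and the sketch below assumes that reading. The function name and the input values are illustrative and are not taken from the paper.

```python
def harmonic_mean(base_acc, novel_acc):
    """Harmonic mean of two accuracies, H = 2 * b * n / (b + n).

    Assumes the two inputs are base-class and novel-class accuracy in percent.
    """
    if base_acc + novel_acc == 0:
        return 0.0  # avoid division by zero when both accuracies are zero
    return 2 * base_acc * novel_acc / (base_acc + novel_acc)


# Illustrative values only, not results from the paper.
h = harmonic_mean(80.0, 60.0)
print(f"{h:.2f}")  # prints 68.57
```

The harmonic mean sits closer to the lower of the two accuracies, so a method cannot score well by doing well on base classes alone.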
