Consistency-guided Prompt Learning for Vision-Language Models

1 Jun 2023  ·  Shuvendu Roy, Ali Etemad ·

We propose Consistency-guided Prompt learning (CoPrompt), a new fine-tuning method for vision-language models. Our approach improves the generalization of large foundation models when fine-tuned on downstream tasks in a few-shot setting. The basic idea of CoPrompt is to enforce a consistency constraint between the predictions of the trainable and pre-trained models to prevent overfitting on the downstream task. Additionally, we introduce two components into our consistency constraint to further boost performance: enforcing consistency on two perturbed inputs, and combining two dominant tuning paradigms, prompting and adapters. Enforcing consistency on perturbed inputs further regularizes the consistency constraint, thereby improving generalization. Moreover, the integration of adapters and prompts not only enhances performance on downstream tasks but also offers increased tuning flexibility in both the input and output spaces. This facilitates more effective adaptation to downstream tasks in a few-shot learning setting. Experiments show that CoPrompt outperforms existing methods on a range of evaluation suites, including base-to-novel generalization, domain generalization, and cross-dataset evaluation. On generalization, CoPrompt improves the state-of-the-art on zero-shot tasks and the overall harmonic mean over 11 datasets. Detailed ablation studies show the effectiveness of each of the components in CoPrompt. We make our code available at https://github.com/ShuvenduRoy/CoPrompt.
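The core idea, a consistency constraint between the frozen pre-trained model and the trainable (prompted/adapted) model, can be sketched as a simple embedding-distance loss. This is a minimal illustrative sketch using a cosine-distance objective on toy NumPy vectors; the paper's exact loss formulation, model interfaces, and perturbation pipeline are not reproduced here, and all function names below are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    """1 minus cosine similarity between two embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(np.dot(a, b))

def consistency_loss(frozen_embed, trainable_embed):
    """Penalize divergence of the trainable model's embedding from the
    frozen pre-trained model's embedding (hypothetical formulation)."""
    return cosine_distance(frozen_embed, trainable_embed)

# Toy check: identical embeddings give zero loss;
# orthogonal embeddings give a loss of 1.
e1 = np.array([1.0, 0.0])
e2 = np.array([0.0, 1.0])
print(consistency_loss(e1, e1))  # → 0.0
print(consistency_loss(e1, e2))  # → 1.0
```

In the paper's setting, the two embeddings would come from two perturbed (augmented) views of the same input, one passed through the frozen pre-trained encoder and one through the prompted-and-adapted trainable encoder, so the constraint regularizes the fine-tuned model toward the generalizable pre-trained representation.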

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Prompt Engineering | Caltech-101 | CoPrompt | Harmonic mean | 96.55 | #3 |
| Prompt Engineering | DTD | CoPrompt | Harmonic mean | 72.79 | #2 |
| Prompt Engineering | EuroSAT | CoPrompt | Harmonic mean | 85.84 | #2 |
| Prompt Engineering | FGVC-Aircraft | CoPrompt | Harmonic mean | 39.76 | #5 |
| Prompt Engineering | Food-101 | CoPrompt | Harmonic mean | 91.40 | #2 |
| Prompt Engineering | ImageNet | CoPrompt | Harmonic mean | 74.33 | #3 |
| Prompt Engineering | ImageNet-A | CoPrompt | Top-1 accuracy % | 50.50 | #6 |
| Prompt Engineering | ImageNet-R | CoPrompt | Top-1 accuracy % | 77.51 | #3 |
| Prompt Engineering | ImageNet-S | CoPrompt | Top-1 accuracy % | 49.43 | #3 |
| Prompt Engineering | Oxford 102 Flower | CoPrompt | Harmonic mean | 85.71 | #5 |
| Prompt Engineering | Oxford-IIIT Pet Dataset | CoPrompt | Harmonic mean | 96.87 | #2 |
| Prompt Engineering | Stanford Cars | CoPrompt | Harmonic mean | 75.66 | #4 |
| Prompt Engineering | SUN397 | CoPrompt | Harmonic mean | 81.31 | #2 |
| Prompt Engineering | UCF101 | CoPrompt | Harmonic mean | 83.07 | #4 |
