3 Nov 2023 • Kevin Vogt-Lowell, Noah Lee, Theodoros Tsiligkaridis, Marc Vaillant
To address these gaps, we present a new recipe for few-shot fine-tuning of the popular vision-language foundation model CLIP and evaluate its performance on challenging benchmark datasets with realistic distribution shifts from the WILDS collection.
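The abstract does not spell out the recipe itself, so as illustration only, here is a minimal sketch of generic few-shot CLIP fine-tuning with the Hugging Face `transformers` CLIP API; the checkpoint name, class prompts, learning rate, and training loop are all assumptions, not the paper's method:

```python
# Hypothetical sketch of few-shot CLIP fine-tuning -- NOT the paper's recipe.
# Class names, prompts, and hyperparameters below are placeholders.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Placeholder label set; a WILDS task would supply its own classes.
class_names = ["class_a", "class_b"]
prompts = [f"a photo of a {c}" for c in class_names]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)

def fine_tune_step(images, labels):
    """One gradient step: score each image against the class-prompt
    embeddings and minimize cross-entropy over the true class."""
    inputs = processor(
        text=prompts, images=images, return_tensors="pt", padding=True
    ).to(device)
    outputs = model(**inputs)
    # logits_per_image has shape (batch_size, num_classes).
    loss = F.cross_entropy(outputs.logits_per_image, labels.to(device))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In a few-shot setting, `fine_tune_step` would be called over a handful of labeled examples per class; evaluating under distribution shift then amounts to running the same prompt-based classification on a WILDS out-of-distribution split.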