Visual Prompt Tuning

34 papers with code • 4 benchmarks • 1 dataset

Visual Prompt Tuning (VPT) introduces only a small number of task-specific learnable parameters into the input space while keeping the entire pre-trained Transformer backbone frozen during downstream training. In practice, these additional parameters are simply prepended to the input sequence of each Transformer layer and learned together with a linear head during fine-tuning. VPT is especially effective in the low-data regime and maintains its advantage across data scales. It also remains competitive across a range of Transformer scales and designs (ViT-Base/Large/Huge, Swin). Taken together, these results suggest that VPT is one of the most effective ways of adapting ever-growing vision backbones.
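The mechanism is simple enough to sketch. Below is a minimal PyTorch illustration of the deep variant (prompts at every layer), assuming a ViT-style backbone that exposes `patch_embed` and a list of Transformer `blocks`; the class name, attribute names, and mean-pooling readout are illustrative assumptions, not the official KMnP/vpt API.

```python
import torch
import torch.nn as nn

class VPTDeep(nn.Module):
    """Minimal sketch of deep Visual Prompt Tuning (illustrative, not the official API)."""

    def __init__(self, backbone, num_prompts=10, embed_dim=768, num_classes=100):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the entire pre-trained backbone

        # One set of learnable prompt tokens per Transformer layer.
        self.prompts = nn.ParameterList([
            nn.Parameter(torch.empty(1, num_prompts, embed_dim))
            for _ in self.backbone.blocks
        ])
        for p in self.prompts:
            nn.init.trunc_normal_(p, std=0.02)

        self.head = nn.Linear(embed_dim, num_classes)  # trained jointly with the prompts
        self.num_prompts = num_prompts

    def forward(self, x):
        x = self.backbone.patch_embed(x)  # (B, N, D) patch tokens
        B = x.shape[0]
        for blk, prompt in zip(self.backbone.blocks, self.prompts):
            # Prepend this layer's prompts, run the frozen block,
            # then drop the prompt outputs before the next layer.
            x = torch.cat([prompt.expand(B, -1, -1), x], dim=1)
            x = blk(x)
            x = x[:, self.num_prompts:, :]
        return self.head(x.mean(dim=1))  # mean-pool + linear head
```

Only `self.prompts` and `self.head` receive gradients; the backbone stays untouched, which is what makes VPT cheap to store and tune per task.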

Most implemented papers

Visual Prompt Tuning

KMnP/vpt 23 Mar 2022

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning.

Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models

zyh16143998882/iccv23-idpt ICCV 2023

To conquer this limitation, we propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models.

Improving Visual Prompt Tuning by Gaussian Neighborhood Minimization for Long-Tailed Visual Recognition

keke921/gnm-pt 28 Oct 2024

To address this issue, we propose a novel method called Random SAM prompt tuning (RSAM-PT) to improve model generalization, requiring only a single gradient computation at each step.
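For intuition, here is a hedged sketch of such a one-gradient update: perturb the trainable (prompt) parameters with Gaussian noise in place of SAM's gradient-ascent step, compute the single gradient at the perturbed point, restore the weights, then step. The function name and signature are hypothetical, not taken from keke921/gnm-pt.

```python
import torch

def random_sam_step(model, loss_fn, inputs, targets, optimizer, sigma=0.01):
    """One-gradient sketch of a Random-SAM-style update (illustrative only)."""
    params = [p for p in model.parameters() if p.requires_grad]
    noises = [torch.randn_like(p) * sigma for p in params]

    with torch.no_grad():
        for p, n in zip(params, noises):
            p.add_(n)  # w -> w + eps, eps ~ N(0, sigma^2 I)

    loss = loss_fn(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()  # the only gradient computation in the step

    with torch.no_grad():
        for p, n in zip(params, noises):
            p.sub_(n)  # restore w; gradients from the perturbed point remain

    optimizer.step()  # update the original weights with those gradients
    return loss.item()
```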

Understanding Zero-Shot Adversarial Robustness for Large-Scale Models

cvlab-columbia/zsrobust4foundationmodel 14 Dec 2022

We apply this training loss to two adaption methods, model finetuning and visual prompt tuning.

Dual Modality Prompt Tuning for Vision-Language Pre-Trained Model

fanrena/dpt 17 Aug 2022

To make the final image feature concentrate more on the target visual concept, our DPT further proposes a Class-Aware Visual Prompt Tuning (CAVPT) scheme. The class-aware visual prompt is generated dynamically by performing cross-attention between text prompt features and image patch token embeddings, encoding both downstream task-related information and visual instance information.
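A minimal sketch of that cross-attention step, assuming batched text prompt features and image patch tokens with a shared embedding width; the module and tensor names are assumptions, not the fanrena/dpt API.

```python
import torch
import torch.nn as nn

class ClassAwarePromptGenerator(nn.Module):
    """Sketch: text prompt features attend over image patch tokens
    to produce class-aware visual prompts (CAVPT-style, illustrative)."""

    def __init__(self, dim=512, num_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, text_feats, patch_tokens):
        # text_feats:   (B, C, D) -- one feature per class text prompt
        # patch_tokens: (B, N, D) -- image patch token embeddings
        prompts, _ = self.cross_attn(query=text_feats,
                                     key=patch_tokens,
                                     value=patch_tokens)
        return prompts  # (B, C, D) class-aware visual prompts
```

The queries come from the text side and the keys/values from the image side, so each generated prompt mixes task-level class information with instance-level visual evidence.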

Visual Prompt Tuning for Generative Transfer Learning

google-research/generative_transfer CVPR 2023

We base our framework on state-of-the-art generative vision transformers that represent an image as a sequence of visual tokens processed by autoregressive or non-autoregressive transformers.

Unified Vision and Language Prompt Learning

yuhangzang/upt 13 Oct 2022

Prompt tuning, a parameter- and data-efficient transfer learning paradigm that tunes only a small number of parameters in a model's input space, has become a trend in the vision community since the emergence of large vision-language models like CLIP.

Multitask Vision-Language Prompt Tuning

sincerass/mvlpt 21 Nov 2022

Specifically, (i) we demonstrate the effectiveness of learning a single transferable prompt from multiple source tasks to initialize the prompt for each target task; (ii) we show that many target tasks can benefit from sharing prompt vectors and thus can be jointly learned via multitask prompt tuning.
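A toy sketch of both ideas, assuming one shared prompt tensor learned on source tasks; all names here are hypothetical, not the sincerass/mvlpt API.

```python
import torch
import torch.nn as nn

# (i) Transfer: a prompt learned on the source tasks initializes
#     a fresh, independently trained prompt for each target task.
source_prompt = nn.Parameter(torch.randn(10, 512) * 0.02)  # learned upstream

target_prompts = {
    task: nn.Parameter(source_prompt.detach().clone())
    for task in ["task_a", "task_b", "task_c"]  # placeholder task names
}

# (ii) Sharing: a group of target tasks trains one prompt jointly,
#      so all of their losses backpropagate into the same parameters.
group_prompt = nn.Parameter(source_prompt.detach().clone())
```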

Learning Disentangled Prompts for Compositional Image Synthesis

cuixing100876/instastyle 1 Jun 2023

We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image in order to synthesize novel images, to better understand compositional image synthesis.

Improving Visual Prompt Tuning for Self-supervised Vision Transformers

ryongithub/gatedprompttuning 8 Jun 2023

Visual Prompt Tuning (VPT) is an effective tuning method for adapting pretrained Vision Transformers (ViTs) to downstream tasks.