Visual Prompt Tuning
19 papers with code • 4 benchmarks • 0 datasets
Visual Prompt Tuning (VPT) introduces only a small number of task-specific learnable parameters into the input space while keeping the entire pre-trained Transformer backbone frozen during downstream training. In practice, these additional parameters are simply prepended to the input sequence of each Transformer layer and learned together with a linear head during fine-tuning. VPT is especially effective in the low-data regime and maintains its advantage across data scales. It is also competitive across a range of Transformer scales and designs (ViT-Base/Large/Huge, Swin). Taken together, these results suggest that VPT is one of the most effective ways of adapting ever-growing vision backbones.
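The mechanism is easiest to see in code. Below is a minimal, hypothetical PyTorch sketch of the deep variant, assuming `blocks` is an `nn.ModuleList` of pre-trained Transformer blocks that each map a `(B, L, D)` token sequence to the same shape; the class name and defaults are illustrative, not the reference implementation.

```python
import torch
import torch.nn as nn

class VPTDeep(nn.Module):
    """Sketch of deep visual prompt tuning: fresh learnable prompts are
    prepended to the token sequence at every Transformer layer, while the
    pre-trained backbone stays frozen. Only prompts and head are trained."""

    def __init__(self, blocks, embed_dim=768, num_prompts=10, num_classes=100):
        super().__init__()
        self.blocks = blocks  # assumed: nn.ModuleList of pre-trained blocks
        for p in self.blocks.parameters():
            p.requires_grad = False  # freeze the entire backbone
        # one set of task-specific prompts per layer
        self.prompts = nn.Parameter(
            torch.randn(len(blocks), num_prompts, embed_dim) * 0.02
        )
        self.num_prompts = num_prompts
        self.head = nn.Linear(embed_dim, num_classes)  # learned linear head

    def forward(self, tokens):
        # tokens: (B, 1 + N, D) -- [CLS] token followed by patch embeddings
        B = tokens.size(0)
        for i, block in enumerate(self.blocks):
            prompts = self.prompts[i].unsqueeze(0).expand(B, -1, -1)
            # discard the previous layer's prompt outputs and prepend fresh
            # prompts, keeping [CLS] first in the sequence layout
            cls_tok = tokens[:, :1]
            rest = tokens[:, 1 + (self.num_prompts if i > 0 else 0):]
            tokens = torch.cat([cls_tok, prompts, rest], dim=1)
            tokens = block(tokens)  # frozen pre-trained layer
        return self.head(tokens[:, 0])  # classify from the [CLS] output
```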
Most implemented papers
E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning
Specifically, we introduce a set of learnable key-value prompts and visual prompts into self-attention and input layers, respectively, to improve the effectiveness of model fine-tuning.
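As a rough illustration of what key-value prompts mean, here is a hedged PyTorch sketch of a self-attention layer where small learnable vectors are concatenated to the keys and values; the module name, initialization, and prompt count are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KVPromptedAttention(nn.Module):
    """Self-attention with learnable key-value prompts, in the spirit of
    E^2VPT: learnable vectors are appended to the frozen layer's keys and
    values, so only the prompts receive gradients. A sketch, not the
    reference implementation."""

    def __init__(self, embed_dim=768, num_heads=12, num_kv_prompts=5):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        self.qkv = nn.Linear(embed_dim, 3 * embed_dim)  # frozen in practice
        self.proj = nn.Linear(embed_dim, embed_dim)     # frozen in practice
        # learnable prompts appended to each head's keys and values
        self.k_prompt = nn.Parameter(
            torch.randn(num_heads, num_kv_prompts, self.head_dim) * 0.02)
        self.v_prompt = nn.Parameter(
            torch.randn(num_heads, num_kv_prompts, self.head_dim) * 0.02)

    def forward(self, x):
        B, L, D = x.shape
        qkv = self.qkv(x).reshape(B, L, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (B, H, L, d)
        k = torch.cat([self.k_prompt.unsqueeze(0).expand(B, -1, -1, -1), k], dim=2)
        v = torch.cat([self.v_prompt.unsqueeze(0).expand(B, -1, -1, -1), v], dim=2)
        attn = F.scaled_dot_product_attention(q, k, v)  # (B, H, L, d)
        return self.proj(attn.transpose(1, 2).reshape(B, L, D))
```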
Online Class Incremental Learning on Stochastic Blurry Task Boundary via Mask and Visual Prompt Tuning
In addition, to alleviate the class imbalance problem, we introduce a new gradient similarity-based focal loss and adaptive feature scaling to ease overfitting to the major classes and underfitting to the minor classes.
Unlocking the Potential of Prompt-Tuning in Bridging Generalized and Personalized Federated Learning
Existing Generalized FL (GFL) and Personalized FL (PFL) methods have limitations in balancing performance across both global and local data distributions.
TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding
Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc.
SA^2VP: Spatially Aligned-and-Adapted Visual Prompt
Typical methods for visual prompt tuning follow the sequential modeling paradigm inherited from NLP: an input image is represented as a flattened sequence of token embeddings, and a set of unordered parameterized tokens prefixed to this sequence is learned as the visual prompts for adapting large vision models to the task.
Revisiting the Power of Prompt for Visual Tuning
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
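To make the idea of prototype-based initialization concrete, here is a hedged sketch under strong assumptions: `model.embed(images)` is a hypothetical method returning `(B, N, D)` patch tokens from the frozen backbone, `model.prompts` a `(num_prompts, D)` parameter, and the loader yields `(image, label)` batches; the k-means pooling is one plausible way to form prototypes, not necessarily the paper's procedure.

```python
import torch

@torch.no_grad()
def init_prompts_from_prototypes(model, loader, num_prompts, device="cpu"):
    """Hypothetical helper: run the frozen backbone over downstream images,
    pool patch tokens into prototypes, and copy them into the prompt
    parameters as an initialization."""
    feats = []
    for images, _ in loader:                        # assumed (image, label) batches
        tokens = model.embed(images.to(device))     # (B, N, D) patch tokens
        feats.append(tokens.mean(dim=1))            # image-level mean token
    feats = torch.cat(feats)                        # (num_images, D)
    # simple k-means to distill num_prompts prototypes from the features
    centers = feats[torch.randperm(len(feats))[:num_prompts]].clone()
    for _ in range(10):
        assign = torch.cdist(feats, centers).argmin(dim=1)
        for j in range(num_prompts):
            members = feats[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(dim=0)
    model.prompts.data.copy_(centers)               # prototype initialization
```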
CoLLaVO: Crayon Large Language and Vision mOdel
Our findings reveal that the image understanding capabilities of current VLMs are strongly correlated with their zero-shot performance on vision language (VL) tasks.
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
SAVPT features a novel metric, Severity, that divides all adverse-scene images into low-severity and high-severity groups.
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning
Panoptic segmentation, combining semantic and instance segmentation, stands as a cutting-edge computer vision task.