Visual Prompting
50 papers with code • 0 benchmarks • 0 datasets
Visual Prompting adapts the prompting paradigm behind recent breakthroughs in NLP to computer vision. Instead of fine-tuning a model for each new task, a small number of visual prompts (for example, labeled points, marks, or learnable input perturbations) steer a pre-trained model, making it possible to turn an unlabeled dataset into a deployed model quickly and significantly reducing development time for both individual projects and enterprise solutions.
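As a concrete illustration, one common form of visual prompting attaches a small set of learnable pixels (for example, a border) to every input of a frozen, pre-trained classifier and trains only those pixels. The PyTorch sketch below is a minimal example under these assumptions; the class name `PaddingPrompt` is illustrative and not taken from any specific paper.

```python
# Minimal sketch: a learnable pixel-space prompt added to every input image
# of a frozen, pre-trained classifier. Only the border pixels are trained.
import torch
import torch.nn as nn

class PaddingPrompt(nn.Module):
    """Learnable border of width `pad` around a (3, size, size) image."""
    def __init__(self, size: int = 224, pad: int = 30):
        super().__init__()
        mask = torch.ones(1, 3, size, size)
        mask[:, :, pad:size - pad, pad:size - pad] = 0  # interior stays untouched
        self.register_buffer("mask", mask)
        self.prompt = nn.Parameter(torch.zeros(1, 3, size, size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.prompt * self.mask  # prompt is broadcast over the batch

# Usage sketch: wrap a frozen backbone so only `prompt` receives gradients.
# backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()
# for p in backbone.parameters():
#     p.requires_grad_(False)
# model = nn.Sequential(PaddingPrompt(), backbone)
```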
Most implemented papers
Segment Anything
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.
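SAM is prompted with points, boxes, or masks. The snippet below is a usage sketch assuming the `segment_anything` package from the official repository and a locally downloaded ViT-H checkpoint (the file name reflects the released checkpoint but depends on your setup).

```python
# Sketch of prompting SAM with a single foreground point.
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for an RGB image (H, W, 3)
predictor.set_image(image)

masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # (x, y) pixel coordinate of the prompt
    point_labels=np.array([1]),           # 1 = foreground point, 0 = background
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]      # keep the highest-scoring candidate mask
```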
ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
Furthermore, we present ViP-Bench, a comprehensive benchmark to assess the capability of models in understanding visual prompts across multiple dimensions, enabling future research in this domain.
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
We present Set-of-Mark (SoM), a new visual prompting method, to unleash the visual grounding abilities of large multimodal models (LMMs), such as GPT-4V.
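The core idea is to overlay numbered marks on segmented regions so the LMM can refer to regions by number in its answer. The sketch below shows one simple way to draw such marks with OpenCV; the centroid placement and drawing style are assumptions, not the paper's exact recipe.

```python
# Illustrative sketch of the Set-of-Mark idea: draw '1', '2', ... on
# segmented regions before sending the image to a large multimodal model.
import numpy as np
import cv2

def overlay_marks(image: np.ndarray, masks: list[np.ndarray]) -> np.ndarray:
    """Draw a numeric mark at the centroid of each boolean mask."""
    out = image.copy()
    for i, mask in enumerate(masks, start=1):
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            continue
        cx, cy = int(xs.mean()), int(ys.mean())
        cv2.putText(out, str(i), (cx, cy), cv2.FONT_HERSHEY_SIMPLEX,
                    1.0, (255, 255, 255), 2, cv2.LINE_AA)
    return out

# The marked image is then paired with a text question such as
# "What is the object labeled 3?" when querying the LMM.
```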
Visual In-Context Prompting
In-context prompting in large language models (LLMs) has become a prevalent approach to improve zero-shot capabilities, but this idea is less explored in the vision domain.
Visual Prompting for Adversarial Robustness
In this work, we leverage visual prompting (VP) to improve adversarial robustness of a fixed, pre-trained model at testing time.
Explicit Visual Prompting for Universal Foreground Segmentations
We take inspiration from the widely-used pre-training and then prompt tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP).
Uncovering the Hidden Cost of Model Compression
This empirical investigation underscores the need for a nuanced understanding beyond mere accuracy in sparse and quantized settings, thereby paving the way for further exploration in Visual Prompting techniques tailored for sparse and quantized models.
Exploring Visual Prompts for Adapting Large-Scale Models
The surprising effectiveness of visual prompting provides a new perspective on adapting pre-trained models in vision.
Visual Prompting via Image Inpainting
How does one adapt a pre-trained visual model to novel downstream tasks without task-specific finetuning or any model modification?
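The paper's answer is to cast the task as inpainting: an example input-output pair and the query are stitched into a single grid image, and the model fills in the missing cell. The sketch below shows one plausible 2x2 layout; the helper name and exact arrangement are illustrative assumptions rather than the authors' pipeline.

```python
# Hedged sketch of visual prompting via image inpainting: build a 2x2 canvas
# and a mask marking the cell that an inpainting model should fill in.
import numpy as np

def build_grid(example_in: np.ndarray, example_out: np.ndarray,
               query: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Return the 2x2 canvas and a boolean mask of the cell to be inpainted."""
    h, w, c = example_in.shape
    canvas = np.zeros((2 * h, 2 * w, c), dtype=example_in.dtype)
    canvas[:h, :w] = example_in        # top-left: example input
    canvas[:h, w:] = example_out       # top-right: example output
    canvas[h:, :w] = query             # bottom-left: query input
    hole = np.zeros((2 * h, 2 * w), dtype=bool)
    hole[h:, w:] = True                # bottom-right: filled by the inpainting model
    return canvas, hole

# The inpainted bottom-right cell is read off as the prediction for the query.
```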
Understanding and Improving Visual Prompting: A Label-Mapping Perspective
We show that when reprogramming an ImageNet-pretrained ResNet-18 to 13 target tasks, our method outperforms baselines by a substantial margin, e.g., 7.9% and 6.7% accuracy improvements in transfer learning to the target Flowers102 and CIFAR100 datasets.
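Label mapping here refers to how target classes are tied to the source (ImageNet) classes of the frozen model. The sketch below shows only a simplified frequency-based mapping, a common baseline and not the paper's full method; the function name and interface are assumptions.

```python
# Simplified frequency-based label mapping for model reprogramming: each target
# class is tied to the source class that the frozen model predicts most often
# on that target class's training images. (Collisions between target classes
# are not resolved here; this is only an illustration.)
import torch

@torch.no_grad()
def frequency_label_mapping(source_logits: torch.Tensor,
                            target_labels: torch.Tensor,
                            num_target: int) -> torch.Tensor:
    """source_logits: (N, num_source); returns a (num_target,) tensor of source ids."""
    preds = source_logits.argmax(dim=1)
    num_source = source_logits.shape[1]
    mapping = torch.zeros(num_target, dtype=torch.long)
    for t in range(num_target):
        counts = torch.bincount(preds[target_labels == t], minlength=num_source)
        mapping[t] = counts.argmax()
    return mapping

# At inference, the target prediction is the target class whose mapped source
# logit is largest: target_logits = source_logits[:, mapping].
```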