Visual Prompting

30 papers with code • 0 benchmarks • 0 datasets

Visual Prompting is the task of adapting computer vision models with prompts, inspired by the success of text prompting in NLP. Rather than fully fine-tuning a model, a small number of visual prompts steer a pretrained model toward the target task, so an unlabeled dataset can be turned into a deployed model quickly, cutting development time for both individual projects and enterprise solutions.
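
One widely used formulation (see "Exploring Visual Prompts for Adapting Large-Scale Models" below) treats the prompt as a small set of learnable pixels added to every input while the pretrained model stays frozen. A minimal PyTorch sketch, assuming a torchvision classifier; the module and parameter names here are illustrative, not taken from any specific repository:

```python
import torch
import torch.nn as nn
from torchvision import models

class PaddingPrompt(nn.Module):
    """Learnable pixel-space prompt: a trainable frame of `pad` pixels
    added to the border of every input image."""
    def __init__(self, image_size=224, pad=30):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(1, 3, image_size, image_size))
        # Keep only the border region trainable by masking out the interior.
        mask = torch.ones(1, 1, image_size, image_size)
        mask[:, :, pad:-pad, pad:-pad] = 0
        self.register_buffer("mask", mask)

    def forward(self, x):
        return x + self.prompt * self.mask

# Frozen pretrained backbone; only the prompt would be optimized.
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in backbone.parameters():
    p.requires_grad_(False)

prompt = PaddingPrompt()
logits = backbone(prompt(torch.randn(4, 3, 224, 224)))  # same model, prompted inputs
```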

Most implemented papers

Segment Anything

facebookresearch/segment-anything ICCV 2023

We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation.
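
The repository exposes a promptable segmentation interface; a typical point-prompt call looks roughly like the following sketch (checkpoint path and image file are placeholders):

```python
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a SAM checkpoint (path is a placeholder) and wrap it in a predictor.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# A single foreground point (x, y) acts as the visual prompt.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),    # 1 = foreground, 0 = background
    multimask_output=True,         # return several candidate masks
)
```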

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V

microsoft/SoM 17 Oct 2023

We present Set-of-Mark (SoM), a new visual prompting method, to unleash the visual grounding abilities of large multimodal models (LMMs), such as GPT-4V.
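
The core idea is straightforward to reproduce: partition the image (e.g., with an off-the-shelf segmentation model), draw a numeric mark on each region, and send the marked image to the LMM so it can refer to regions by number. A hedged sketch of the overlay step; the mask source and drawing details are illustrative, not the repository's exact pipeline:

```python
from PIL import Image, ImageDraw, ImageFont
import numpy as np

def overlay_marks(image: Image.Image, masks: list[np.ndarray]) -> Image.Image:
    """Draw a numeric label at the centroid of each boolean region mask,
    so a multimodal model can refer to regions as "1", "2", ... in text."""
    marked = image.copy()
    draw = ImageDraw.Draw(marked)
    font = ImageFont.load_default()
    for idx, mask in enumerate(masks, start=1):
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            continue
        cx, cy = int(xs.mean()), int(ys.mean())
        draw.text((cx, cy), str(idx), fill="white", font=font)
    return marked
```

The marked image, together with a text query that refers to the numbered regions, is then passed to the multimodal model.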

Visual In-Context Prompting

ux-decoder/dinov 22 Nov 2023

In-context prompting in large language models (LLMs) has become a prevalent approach to improve zero-shot capabilities, but this idea is less explored in the vision domain.

Visual Prompting for Adversarial Robustness

Phoveran/vp-for-adversarial-robustness 12 Oct 2022

In this work, we leverage visual prompting (VP) to improve adversarial robustness of a fixed, pre-trained model at testing time.

Explicit Visual Prompting for Universal Foreground Segmentations

nifangbaage/explicit-visual-prompt 29 May 2023

We take inspiration from the widely-used pre-training and then prompt tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP).

Exploring Visual Prompts for Adapting Large-Scale Models

hjbahng/visual_prompting 31 Mar 2022

The surprising effectiveness of visual prompting provides a new perspective on adapting pre-trained models in vision.
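
A hedged sketch of the corresponding optimization loop, reusing the PaddingPrompt module sketched under the task description above: only the prompt parameters receive gradients, the backbone stays frozen, and target predictions are read off a fixed mapping from source to target classes. The label mapping, learning rate, and data loader here are placeholders for illustration:

```python
import torch
import torch.nn.functional as F

# Assumes `backbone` (frozen) and `prompt` (PaddingPrompt) from the earlier sketch,
# plus a `train_loader` yielding (images, labels) for the downstream task.
optimizer = torch.optim.SGD(prompt.parameters(), lr=40.0, momentum=0.9)

def train_step(images, labels, num_target_classes=10):
    logits = backbone(prompt(images))
    # Simple "first-k" label mapping: reuse the first k source logits
    # as the target classes (a common baseline mapping; illustrative only).
    loss = F.cross_entropy(logits[:, :num_target_classes], labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```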

Visual Prompting via Image Inpainting

amirbar/visual_prompting 1 Sep 2022

How does one adapt a pre-trained visual model to novel downstream tasks without task-specific finetuning or any model modification?
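
The paper frames adaptation as inpainting: example input-output pairs and the query are pasted into a single grid image, and a pretrained inpainting model fills in the missing output cell. A minimal sketch of the canvas construction; the cell layout and sizes are illustrative, not the repository's exact settings:

```python
from PIL import Image

def build_prompt_canvas(example_in: Image.Image, example_out: Image.Image,
                        query: Image.Image, cell: int = 224) -> Image.Image:
    """Arrange example input/output and the query as a 2x2 grid image.
    A pretrained image inpainting model is then asked to fill the blank
    bottom-right cell, which plays the role of the task output."""
    canvas = Image.new("RGB", (2 * cell, 2 * cell), "gray")
    canvas.paste(example_in.resize((cell, cell)), (0, 0))        # top-left
    canvas.paste(example_out.resize((cell, cell)), (cell, 0))    # top-right
    canvas.paste(query.resize((cell, cell)), (0, cell))          # bottom-left
    # bottom-right cell is left blank and masked out for inpainting
    return canvas
```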

Understanding and Improving Visual Prompting: A Label-Mapping Perspective

optml-group/ilm-vp CVPR 2023

We show that when reprogramming an ImageNet-pretrained ResNet-18 to 13 target tasks, our method outperforms baselines by a substantial margin, e.g., 7.9% and 6.7% accuracy improvements in transfer learning to the target Flowers102 and CIFAR100 datasets.

Unleashing the Power of Visual Prompting At the Pixel Level

ucsc-vlaa/evp 20 Dec 2022

This paper presents a simple and effective visual prompting method for adapting pre-trained models to downstream recognition tasks.

Text-Visual Prompting for Efficient 2D Temporal Video Grounding

intel/TVP CVPR 2023

In this paper, we study the problem of temporal video grounding (TVG), which aims to predict the starting/ending time points of moments described by a text sentence within a long untrimmed video.