Search Results for author: Zhiqiu Lin

We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models such that a non-expert user can define a new task depending on their needs via a few labeled examples and minimal domain knowledge.

Fine-Grained Image Classification Semantic Segmentation +1

The CLEAR Benchmark: Continual LEArning on Real-World Imagery

1 code implementation • 17 Jan 2022 • Zhiqiu Lin, Jia Shi, Deepak Pathak, Deva Ramanan

The major strength of CLEAR over prior CL benchmarks is the smooth temporal evolution of visual concepts with real-world imagery, including both high-quality labeled data and abundant unlabeled samples per time period for continual semi-supervised learning.

Continual Learning Image Classification +2

Continual Learning with Evolving Class Ontologies

no code implementations • 10 Oct 2022 • Zhiqiu Lin, Deepak Pathak, Yu-Xiong Wang, Deva Ramanan, Shu Kong

LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$).

Class Incremental Learning Image Classification +3

Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

1 code implementation • CVPR 2023 • Zhiqiu Lin, Samuel Yu, Zhiyi Kuang, Deepak Pathak, Deva Ramanan

By repurposing class names as additional one-shot training samples, we achieve SOTA results with an embarrassingly simple linear classifier for vision-language adaptation.

Audio Classification Few-Shot Learning
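
The idea in the summary above, treating each class name as one extra training sample for a linear classifier, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes image and text features already live in a shared embedding space, and it replaces real encoder outputs with synthetic vectors. The names `l2_normalize`, `train_linear_classifier`, and all feature arrays are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def train_linear_classifier(features, labels, num_classes, lr=0.5, epochs=200):
    """Softmax regression trained with full-batch gradient descent."""
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        weights -= lr * features.T @ (probs - one_hot) / n
    return weights

rng = np.random.default_rng(0)
num_classes, dim, shots = 3, 16, 2

# Synthetic stand-ins for encoder outputs (assumption: shared embedding space).
class_centers = l2_normalize(rng.normal(size=(num_classes, dim)))
image_features = l2_normalize(
    np.repeat(class_centers, shots, axis=0)
    + 0.1 * rng.normal(size=(num_classes * shots, dim))
)
image_labels = np.repeat(np.arange(num_classes), shots)

# One text feature per class name, used as an additional one-shot sample.
text_features = l2_normalize(class_centers + 0.1 * rng.normal(size=(num_classes, dim)))
text_labels = np.arange(num_classes)

# Cross-modal training set: few-shot images plus one class-name sample per class.
train_features = np.concatenate([image_features, text_features])
train_labels = np.concatenate([image_labels, text_labels])

weights = train_linear_classifier(train_features, train_labels, num_classes)
predictions = (train_features @ weights).argmax(axis=1)
```

The only cross-modal step is the concatenation: the classifier itself is unchanged and simply sees `shots + 1` samples per class instead of `shots`.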

Revisiting the Role of Language Priors in Vision-Language Models

1 code implementation • 2 Jun 2023 • Zhiqiu Lin, Xinyue Chen, Deepak Pathak, Pengchuan Zhang, Deva Ramanan

Our first observation is that they can be repurposed for discriminative tasks (such as image-text retrieval) by simply computing the match score of generating a particular text string given an image.

Ranked #45 on Visual Reasoning on Winoground

Image-text matching Language Modelling +6
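
A rough sketch of the match score described above: score each candidate text by how likely the model is to generate it given the image, then pick the highest-scoring candidate. The sketch assumes an autoregressive model whose text probability factorizes into per-token conditional probabilities, which the summary does not state explicitly. The probabilities below are made-up numbers, and `generative_match_score` and `rank_captions` are hypothetical names.

```python
import math

def generative_match_score(token_probs):
    """Log-probability of a text given an image, as a sum of per-token log-probs.

    token_probs[k] stands for P(token_k | earlier tokens, image).
    """
    return sum(math.log(p) for p in token_probs)

def rank_captions(candidates):
    """Return candidate texts ordered from highest to lowest match score."""
    return sorted(
        candidates,
        key=lambda text: generative_match_score(candidates[text]),
        reverse=True,
    )

# Made-up per-token probabilities for two candidate captions of one image.
candidates = {
    "a dog chasing a cat": [0.9, 0.8, 0.7, 0.9, 0.6],
    "a cat chasing a dog": [0.9, 0.3, 0.7, 0.9, 0.2],
}

best = rank_captions(candidates)[0]
```

Working in log space avoids underflow on long texts; for retrieval, the same score is computed for every image-text pair and the pairs are ranked.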

Language Models as Black-Box Optimizers for Vision-Language Models

1 code implementation • 12 Sep 2023 • Shihong Liu, Zhiqiu Lin, Samuel Yu, Ryan Lee, Tiffany Ling, Deepak Pathak, Deva Ramanan

We highlight the advantage of conversational feedback that incorporates both positive and negative prompts, suggesting that LLMs can utilize the implicit gradient direction in textual feedback for a more efficient search.

Few-Shot Image Classification

Prompting Scientific Names for Zero-Shot Species Recognition

no code implementations • 15 Oct 2023 • Shubham Parashar, Zhiqiu Lin, Yanan Li, Shu Kong

We find that common names are more likely to be included in CLIP's training set, and prompting them achieves 2$\sim$5 times higher accuracy on benchmarking datasets of fine-grained species recognition.

Benchmarking Zero-Shot Learning

The Neglected Tails of Vision-Language Models

no code implementations • 23 Jan 2024 • Shubham Parashar, Zhiqiu Lin, Tian Liu, Xiangjue Dong, Yanan Li, Deva Ramanan, James Caverlee, Shu Kong

We address this by using large language models (LLMs) to count the number of pretraining texts that contain synonyms of these concepts.

Retrieval Zero-Shot Learning

Evaluating Text-to-Visual Generation with Image-to-Text Generation

2 code implementations • 1 Apr 2024 • Zhiqiu Lin, Deepak Pathak, Baiqi Li, Jiayao Li, Xide Xia, Graham Neubig, Pengchuan Zhang, Deva Ramanan

For instance, the widely used CLIPScore measures the alignment between a (generated) image and text prompt, but it fails to produce reliable scores for complex prompts involving compositions of objects, attributes, and relations.

Question Answering Text Generation +2
