Search Results for author: Noah A Smith

Found 3 papers, 2 papers with code

Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

no code implementations13 Jun 2024 Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna

In this work, we introduce Sketchpad, a framework that gives multimodal LMs a visual sketchpad and tools to draw on the sketchpad.

Math object-detection +2

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering

1 code implementation ICCV 2023 Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A Smith

We introduce TIFA (Text-to-Image Faithfulness evaluation with question Answering), an automatic evaluation metric that measures the faithfulness of a generated image to its text input via visual question answering (VQA).

4k Language Modelling +4

PromptCap: Prompt-Guided Task-Aware Image Captioning

1 code implementation15 Nov 2022 Yushi Hu, Hang Hua, Zhengyuan Yang, Weijia Shi, Noah A Smith, Jiebo Luo

PromptCap outperforms generic captions by a large margin and achieves state-of-the-art accuracy on knowledge-based VQA tasks (60. 4% on OK-VQA and 59. 6% on A-OKVQA).

Image Captioning Language Modelling +5

Cannot find the paper you are looking for? You can Submit a new open access paper.