Search Results for author: Amita Kamath

Found 7 papers, 5 with code

What's "up" with vision-language models? Investigating their struggle with spatial reasoning

1 code implementation • 30 Oct 2023 • Amita Kamath, Jack Hessel, Kai-Wei Chang

Recent vision-language (VL) models are powerful, but can they reliably distinguish "right" from "left"?
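
The kind of probe this question points at can be illustrated with a contrastive VL model: score one image against two captions that differ only in a spatial preposition and see whether the model prefers the correct one. The sketch below is a hypothetical illustration, not the paper's benchmark; the filename, captions, and the choice of the openai/clip-vit-base-patch32 checkpoint via the Hugging Face transformers library are all assumptions.

```python
# Minimal sketch: score an image against two captions that differ only in a
# spatial preposition. "mug_on_table.jpg" and the captions are hypothetical.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("mug_on_table.jpg")  # hypothetical image of a mug on a table
captions = ["a mug on a table", "a mug under a table"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-to-text similarity scores
probs = logits.softmax(dim=-1).squeeze()
print({caption: round(p.item(), 3) for caption, p in zip(captions, probs)})
```

A model with reliable spatial reasoning should assign clearly higher probability to the caption matching the image, regardless of which preposition appears in it.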

Text encoders bottleneck compositionality in contrastive vision-language models

1 code implementation • 24 May 2023 • Amita Kamath, Jack Hessel, Kai-Wei Chang

We first curate CompPrompts, a set of increasingly compositional image captions that VL models should be able to capture (e.g., from a single object, to object + property, to multiple interacting objects).

Tasks: Attribute, Image Captioning, +1
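
To make the "increasingly compositional" idea concrete, here is a small hypothetical illustration; these example captions are invented for this listing, not entries from CompPrompts.

```python
# Hypothetical captions showing increasing compositionality, in the spirit of
# CompPrompts (invented examples, not taken from the dataset).
compositional_levels = {
    "single object": "a dog",
    "object + property": "a brown dog",
    "multiple interacting objects": "a brown dog chasing a white cat across a lawn",
}
for level, caption in compositional_levels.items():
    print(f"{level}: {caption}")
```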

Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

1 code implementation • 28 Mar 2023 • Adyasha Maharana, Amita Kamath, Christopher Clark, Mohit Bansal, Aniruddha Kembhavi

As general purpose vision models get increasingly effective at a wide set of tasks, it is imperative that they be consistent across the tasks they support.

Webly Supervised Concept Expansion for General Purpose Vision Models

no code implementations • 4 Feb 2022 • Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi

This work presents an effective and inexpensive alternative: learn skills from supervised datasets, learn concepts from web image search, and leverage a key characteristic of GPVs: the ability to transfer visual knowledge across skills.

Tasks: Human-Object Interaction Detection, Image Retrieval, +4

Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture

no code implementations • CVPR 2022 • Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem

To reduce the time and expertise required to develop new applications, we would like to create general purpose vision systems that can learn and perform a range of tasks without any modification to the architecture or learning process.

Tasks: Question Answering, Visual Question Answering

Towards General Purpose Vision Systems

2 code implementations • 1 Apr 2021 • Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem

To reduce the time and expertise required to develop new applications, we would like to create general purpose vision systems that can learn and perform a range of tasks without any modification to the architecture or learning process.

Tasks: Question Answering, Visual Question Answering

Selective Question Answering under Domain Shift

2 code implementations • ACL 2020 • Amita Kamath, Robin Jia, Percy Liang

In this work, we propose the setting of selective question answering under domain shift, in which a QA model is tested on a mixture of in-domain and out-of-domain data, and must answer (i.e., not abstain on) as many questions as possible while maintaining high accuracy.

Tasks: Question Answering
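
The selective answering setup can be summarized with a small evaluation sketch: answer only when the model's confidence clears a threshold, then report coverage (fraction of questions answered) and accuracy on the answered subset. The function and variable names below are illustrative assumptions, not the paper's code; the paper studies abstention when the test questions mix in-domain and out-of-domain data.

```python
# Minimal sketch of selective QA evaluation under a confidence threshold.
import numpy as np

def selective_qa_eval(confidences, is_correct, threshold):
    confidences = np.asarray(confidences, dtype=float)
    is_correct = np.asarray(is_correct, dtype=bool)
    answered = confidences >= threshold          # abstain on everything below the threshold
    coverage = answered.mean()                   # fraction of questions answered
    accuracy = is_correct[answered].mean() if answered.any() else float("nan")
    return coverage, accuracy

# Example: sweep thresholds to trade coverage for accuracy (toy numbers).
conf = [0.95, 0.40, 0.80, 0.20, 0.70]
right = [True, False, True, False, True]
for t in (0.5, 0.75, 0.9):
    cov, acc = selective_qa_eval(conf, right, t)
    print(f"threshold={t:.2f}  coverage={cov:.2f}  accuracy={acc:.2f}")
```

Raising the threshold answers fewer questions but typically raises accuracy on those answered, which is the trade-off the selective QA setting measures.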
