Search Results for author: Michael Cogswell

Found 20 papers, 8 papers with code

BloomVQA: Assessing Hierarchical Multi-modal Comprehension

no code implementations · 20 Dec 2023 · Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran

We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks.

Data Augmentation · Memorization · +2

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval

no code implementations · 30 Nov 2023 · Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran

To provide a more thorough evaluation of the capabilities of long video retrieval systems, we propose a pipeline that leverages state-of-the-art large language models to carefully generate a diverse set of synthetic captions for long videos.

Benchmarking · Retrieval · +2

DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

no code implementations · 16 Nov 2023 · Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran

The critique NLF identifies the strengths and weaknesses of the responses and is used to align the LVLMs with human preferences.

Language Modelling

Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models

1 code implementation · 8 Sep 2023 · Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran

Based on this pipeline and the existing coarse-grained annotated dataset, we build the CURE benchmark to measure both the zero-shot reasoning performance and consistency of VLMs.

Visual Reasoning

Probing Conceptual Understanding of Large Visual-Language Models

1 code implementation · 7 Apr 2023 · Madeline Chantry Schiappa, Michael Cogswell, Ajay Divakaran, Yogesh Singh Rawat

In recent years, large visual-language (V+L) models have achieved great success in various downstream tasks.

Benchmarking

Unpacking Large Language Models with Conceptual Consistency

no code implementations · 29 Sep 2022 · Pritish Sahu, Michael Cogswell, Yunye Gong, Ajay Divakaran

The success of Large Language Models (LLMs) indicates they are increasingly able to answer queries like these accurately, but that ability does not necessarily imply a general understanding of concepts relevant to the anchor query.

Language Modelling · Large Language Model

Trigger Hunting with a Topological Prior for Trojan Detection

1 code implementation · ICLR 2022 · Xiaoling Hu, Xiao Lin, Michael Cogswell, Yi Yao, Susmit Jha, Chao Chen

Despite their success and popularity, deep neural networks (DNNs) are vulnerable when facing backdoor attacks.

Improving Users' Mental Model with Attention-directed Counterfactual Edits

no code implementations · 13 Oct 2021 · Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

In the domain of Visual Question Answering (VQA), studies have shown improvement in users' mental model of the VQA system when they are exposed to examples of how these systems answer certain Image-Question (IQ) pairs.

Counterfactual · Question Answering · +2

Emergence of Compositional Language with Deep Generational Transmission

1 code implementation · ICLR 2020 · Michael Cogswell, Jiasen Lu, Stefan Lee, Devi Parikh, Dhruv Batra

In this paper, we introduce these cultural evolutionary dynamics into language emergence by periodically replacing agents in a population to create a knowledge gap, implicitly inducing cultural transmission of language.

Reinforcement Learning (RL)
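
The population dynamic described in the abstract — periodically replacing agents so that the survivors must re-teach newcomers — can be illustrated with a toy loop. This is a hedged sketch only: in the paper the agents are neural learners trained with RL, whereas here they are simple counters standing in for "amount learned":

```python
import random

def train_population(n_agents=4, steps=100, replace_every=25, seed=0):
    """Toy version of generational transmission: every `replace_every`
    steps, one randomly chosen agent is re-initialized, creating a
    knowledge gap the rest of the population must fill by re-teaching."""
    rng = random.Random(seed)
    agents = [0] * n_agents          # "skill" of each agent (toy stand-in)
    replacements = 0
    for t in range(1, steps + 1):
        agents = [a + 1 for a in agents]   # one learning step for everyone
        if t % replace_every == 0:
            idx = rng.randrange(n_agents)
            agents[idx] = 0                # fresh agent: knowledge gap
            replacements += 1
    return agents, replacements

agents, n_replaced = train_population()
```

With these defaults, replacement fires at steps 25, 50, 75, and 100, so the run ends with one freshly reset agent.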

Grad-CAM: Why did you say that?

2 code implementations · 22 Nov 2016 · Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh, Dhruv Batra

We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations.

Image Captioning · Visual Question Answering
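
The heatmap Grad-CAM produces is a gradient-weighted combination of convolutional feature maps followed by a ReLU. A minimal NumPy sketch of that combination step (the network, target class, and choice of layer are assumed to be handled elsewhere; the arrays here are random stand-ins):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Minimal Grad-CAM combination step.

    feature_maps: (K, H, W) conv activations A^k for the chosen layer.
    gradients:    (K, H, W) gradients of the class score w.r.t. A^k.
    Returns an (H, W) localization heatmap.
    """
    # Channel importance: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))              # (K,)
    # Weighted sum of feature maps, then ReLU to keep only
    # features with a positive influence on the class.
    cam = np.tensordot(weights, feature_maps, axes=1)  # (H, W)
    return np.maximum(cam, 0.0)

# Toy activations and gradients in place of a real network.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 7, 7))
dA = rng.standard_normal((8, 7, 7))
heatmap = grad_cam(A, dA)
```

The ReLU is what makes the map class-discriminative: regions whose features argue *against* the class are zeroed out rather than shown as negative evidence.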

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

25 code implementations · 7 Oct 2016 · Ashwin K. Vijayakumar, Michael Cogswell, Ramprasath R. Selvaraju, Qing Sun, Stefan Lee, David Crandall, Dhruv Batra

We observe that our method consistently outperforms BS and previously proposed techniques for diverse decoding from neural sequence models.

Image Captioning · Machine Translation · +4
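
Diverse Beam Search splits the beam budget into groups and penalizes later groups for re-using tokens that earlier groups already chose at the same step. A toy sketch of one decoding step with a Hamming diversity penalty and one beam per group — an illustration of the idea, not the paper's implementation:

```python
import numpy as np

def diverse_beam_step(logprobs, groups, lam=1.0):
    """One decoding step of diverse beam search over a toy vocabulary.

    logprobs: (G, V) per-group token log-probabilities for this step.
    groups:   number of beam groups G (one beam per group here).
    lam:      diversity penalty strength.
    Returns the token index picked by each group.
    """
    V = logprobs.shape[1]
    token_counts = np.zeros(V)   # how often earlier groups picked each token
    picks = []
    for g in range(groups):
        # Hamming diversity: penalize tokens already taken by earlier groups.
        scores = logprobs[g] - lam * token_counts
        tok = int(np.argmax(scores))
        picks.append(tok)
        token_counts[tok] += 1
    return picks

# Two groups with identical distributions: the penalty pushes the
# second group off the greedy token and onto the runner-up.
logprobs = np.log(np.array([[0.6, 0.3, 0.1],
                            [0.6, 0.3, 0.1]]))
picks = diverse_beam_step(logprobs, groups=2)
```

With `lam = 1.0`, the first group takes the argmax token while the second is deflected to a different one, which is exactly the diversity the method trades a little likelihood for.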

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

no code implementations · NeurIPS 2016 · Stefan Lee, Senthil Purushwalkam, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra

Many practical perception systems exist within larger processes that include interactions with users or additional components capable of evaluating the quality of predicted solutions.

Multiple-choice
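
The "oracle" loss behind this training scheme charges the ensemble only for its best member's error on each example, which lets members specialize. A toy scalar-regression sketch of that assignment step (not the paper's deep-network setup):

```python
import numpy as np

def oracle_loss(predictions, target):
    """Oracle set loss for an ensemble.

    predictions: (M, N) scalar predictions of M members on N examples.
    target:      (N,) ground truth.
    Returns the oracle loss and, per example, which member 'won'
    (in sMCL, only the winner receives gradient for that example).
    """
    errors = (predictions - target[None, :]) ** 2   # (M, N) per-member error
    winners = errors.argmin(axis=0)                 # best member per example
    loss = errors.min(axis=0).mean()                # charge only the best
    return loss, winners

# Two members that each nail a different example: the oracle loss is
# zero even though neither member is good everywhere.
preds = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
target = np.array([0.0, 0.0])
loss, winners = oracle_loss(preds, target)
```

The winner-take-gradient assignment is the source of the diversity: each member only needs to cover the examples it is already best at.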

Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks

no code implementations · 19 Nov 2015 · Stefan Lee, Senthil Purushwalkam, Michael Cogswell, David Crandall, Dhruv Batra

Convolutional Neural Networks have achieved state-of-the-art performance on a wide range of tasks.

Combining the Best of Graphical Models and ConvNets for Semantic Segmentation

no code implementations · 14 Dec 2014 · Michael Cogswell, Xiao Lin, Senthil Purushwalkam, Dhruv Batra

We present a two-module approach to semantic segmentation that incorporates Convolutional Networks (CNNs) and Graphical Models.

Segmentation · Semantic Segmentation
