Search Results for author: Arijit Ray

Found 12 papers, 1 paper with code

Lasagna: Layered Score Distillation for Disentangled Object Relighting

1 code implementation • 30 Nov 2023 • Dina Bashkirova, Arijit Ray, Rupayan Mallick, Sarah Adel Bargal, Jianming Zhang, Ranjay Krishna, Kate Saenko

Although generative editing methods now enable some forms of image editing, relighting is still beyond today's capabilities; existing methods struggle to keep other aspects of the image -- colors, shapes, and textures -- consistent after the edit.

Colorization Object +1

Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention

no code implementations • 15 Feb 2019 • Shalini Ghosh, Giedrius Burachas, Arijit Ray, Avi Ziskind

In this paper, we present a novel approach for the task of eXplainable Question Answering (XQA), i.e., generating natural language (NL) explanations for the Visual Question Answering (VQA) problem.

Explanation Generation Language Modelling +2

Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval

no code implementations • 5 Apr 2019 • Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas

Our experiments, therefore, demonstrate that ExAG is an effective means to evaluate the efficacy of AI-generated explanations on a human-AI collaborative task.

Image Retrieval Question Answering +2

The Impact of Explanations on AI Competency Prediction in VQA

no code implementations • 2 Jul 2020 • Kamran Alipour, Arijit Ray, Xiao Lin, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

In this paper, we evaluate the impact of explanations on the user's mental model of AI agent competency within the task of visual question answering (VQA).

Language Modelling Question Answering +1

Improving Users' Mental Model with Attention-directed Counterfactual Edits

no code implementations • 13 Oct 2021 • Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

In the domain of Visual Question Answering (VQA), studies have shown improvement in users' mental model of the VQA system when they are exposed to examples of how these systems answer certain Image-Question (IQ) pairs.

Counterfactual Question Answering +2

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

no code implementations • CVPR 2023 • Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko

We propose a self-supervised approach for learning to perform audio source separation in videos based on natural language queries, using only unlabeled video and audio pairs as training data.

Audio Source Separation Natural Language Queries

Socratis: Are large multimodal models emotionally aware?

no code implementations • 31 Aug 2023 • Katherine Deng, Arijit Ray, Reuben Tan, Saadia Gabriel, Bryan A. Plummer, Kate Saenko

We further see that current captioning metrics based on large vision-language models also fail to correlate with human preferences.

BloomVQA: Assessing Hierarchical Multi-modal Comprehension

no code implementations • 20 Dec 2023 • Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran

We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks.

Data Augmentation Memorization +2
