Search Results for author: Adyasha Maharana

Found 14 papers, 11 papers with code

Multimodal Intent Discovery from Livestream Videos

no code implementations Findings (NAACL) 2022 Adyasha Maharana, Quan Tran, Franck Dernoncourt, Seunghyun Yoon, Trung Bui, Walter Chang, Mohit Bansal

We construct and present a new multimodal dataset consisting of software instructional livestreams and containing manual annotations for both detailed and abstract procedural intent that enable training and evaluation of joint video and text understanding models.

Intent Discovery Video Summarization +1

On Curriculum Learning for Commonsense Reasoning

1 code implementation NAACL 2022 Adyasha Maharana, Mohit Bansal

Hence, we examine the effect of a human-like easy-to-difficult curriculum during finetuning of language models for commonsense reasoning tasks.

Learning-To-Rank Natural Language Understanding +1

GraDA: Graph Generative Data Augmentation for Commonsense Reasoning

1 code implementation COLING 2022 Adyasha Maharana, Mohit Bansal

Recent advances in commonsense reasoning have been fueled by the availability of large-scale human annotated datasets.

Data Augmentation Knowledge Graphs

Integrating Visuospatial, Linguistic, and Commonsense Structure into Story Visualization

1 code implementation EMNLP 2021 Adyasha Maharana, Mohit Bansal

Such information is even more important for story visualization since its inputs have an explicit narrative structure that needs to be translated into an image sequence (or visual story).

Dense Captioning Image Generation +1

Evaluating Very Long-Term Conversational Memory of LLM Agents

no code implementations 27 Feb 2024 Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang

Using this pipeline, we collect LoCoMo, a dataset of very long-term conversations, each encompassing 300 turns and 9K tokens on avg., over up to 35 sessions.

Avg Multi-modal Dialogue Generation +1

Debiasing Multimodal Models via Causal Information Minimization

1 code implementation 28 Nov 2023 Vaidehi Patil, Adyasha Maharana, Mohit Bansal

In this paper, we study bias arising from confounders in a causal graph for multimodal data and examine a novel approach that leverages causally-motivated information minimization to learn the confounder representations.

Visual Question Answering (VQA)

D2 Pruning: Message Passing for Balancing Diversity and Difficulty in Data Pruning

1 code implementation 11 Oct 2023 Adyasha Maharana, Prateek Yadav, Mohit Bansal

There are two dominant approaches: (1) geometry-based data selection for maximizing data diversity in the coreset, and (2) functions that assign difficulty scores to samples based on training dynamics.

Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

1 code implementation 28 Mar 2023 Adyasha Maharana, Amita Kamath, Christopher Clark, Mohit Bansal, Aniruddha Kembhavi

As general purpose vision models get increasingly effective at a wide set of tasks, it is imperative that they be consistent across the tasks they support.

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

1 code implementation 13 Sep 2022 Adyasha Maharana, Darryl Hannan, Mohit Bansal

Hence, we first propose the task of story continuation, where the generated visual story is conditioned on a source image, allowing for better generalization to narratives with new characters.

Image Generation Story Continuation +2

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization

1 code implementation 21 Oct 2021 Adyasha Maharana, Mohit Bansal

Prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality, consistency and relevance.

Dense Captioning Image Generation +1

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

1 code implementation NAACL 2021 Adyasha Maharana, Darryl Hannan, Mohit Bansal

Therefore, we also provide an exploration of evaluation metrics for the model, focused on aspects of the generated frames such as the presence/quality of generated characters, the relevance to captions, and the diversity of the generated images.

Image Generation Story Visualization +1

Adversarial Augmentation Policy Search for Domain and Cross-Lingual Generalization in Reading Comprehension

1 code implementation Findings of the Association for Computational Linguistics 2020 Adyasha Maharana, Mohit Bansal

In this work, we present several effective adversaries and automated data augmentation policy search methods with the goal of making reading comprehension models more robust to adversarial evaluation, but also improving generalization to the source domain as well as new domains and languages.

Data Augmentation Reading Comprehension

Clinical Event Detection with Hybrid Neural Architecture

no code implementations WS 2017 Adyasha Maharana, Meliha Yetisgen

Event detection from clinical notes has traditionally been solved with rule-based and statistical natural language processing (NLP) approaches that require extensive domain knowledge and feature engineering.

Event Detection Feature Engineering +1
