Search Results for author: Mark Yatskar

Found 30 papers, 21 papers with code

Crowdsourcing Beyond Annotation: Case Studies in Benchmark Data Collection

no code implementations EMNLP (ACL) 2021 Alane Suhr, Clara Vania, Nikita Nangia, Maarten Sap, Mark Yatskar, Samuel R. Bowman, Yoav Artzi

Even though it is such a fundamental tool in NLP, crowdsourcing use is largely guided by common practices and the personal experience of researchers.

Interpretable by Design Visual Question Answering

no code implementations 24 May 2023 Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, Dan Roth

Model interpretability has long been a hard problem for the AI community, especially in the multimodal setting, where vision and language need to be aligned and reasoned about at the same time.

Question Answering Visual Question Answering

AmbiCoref: Evaluating Human and Model Sensitivity to Ambiguous Coreference

1 code implementation 1 Feb 2023 Yuewei Yuan, Chaitanya Malaviya, Mark Yatskar

To this end, we construct AmbiCoref, a diagnostic corpus of minimal sentence pairs with ambiguous and unambiguous referents.

Coreference Resolution

Visualizing the Obvious: A Concreteness-based Ensemble Model for Noun Property Prediction

1 code implementation 24 Oct 2022 Yue Yang, Artemis Panagopoulou, Marianna Apidianaki, Mark Yatskar, Chris Callison-Burch

We propose to extract these properties from images and use them in an ensemble model, in order to complement the information that is extracted from language models.

Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models

1 code implementation 24 Oct 2022 Chaitanya Malaviya, Sudeep Bhatia, Mark Yatskar

Cognitive psychologists have documented that humans use cognitive heuristics, or mental shortcuts, to make quick decisions while expending less effort.

Multiple-choice Reading Comprehension

Induce, Edit, Retrieve: Language Grounded Multimodal Schema for Instructional Video Retrieval

no code implementations 17 Nov 2021 Yue Yang, Joongwon Kim, Artemis Panagopoulou, Mark Yatskar, Chris Callison-Burch

Schemata are structured representations of complex tasks that can aid artificial intelligence by allowing models to break down complex tasks into intermediate steps.

Retrieval Video Retrieval

Visual Goal-Step Inference using wikiHow

1 code implementation EMNLP 2021 Yue Yang, Artemis Panagopoulou, Qing Lyu, Li Zhang, Mark Yatskar, Chris Callison-Burch

Understanding what sequence of steps is needed to complete a goal can help artificial intelligence systems reason about human activities.


Visual Semantic Role Labeling for Video Understanding

1 code implementation CVPR 2021 Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi

We propose a new framework for understanding and representing related salient events in a video using visual semantic role labeling.

Semantic Role Labeling Video Recognition +1

Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles

1 code implementation Findings of the Association for Computational Linguistics 2020 Christopher Clark, Mark Yatskar, Luke Zettlemoyer

We evaluate performance on synthetic datasets, and four datasets built to penalize models that exploit known biases on textual entailment, visual question answering, and image recognition tasks.

Natural Language Inference Question Answering +1

What Does BERT with Vision Look At?

no code implementations ACL 2020 Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang

Pre-trained visually grounded language models such as ViLBERT, LXMERT, and UNITER have achieved significant performance improvement on vision-and-language tasks but what they learn during pre-training remains unclear.

Language Modelling

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

1 code implementation CVPR 2020 Matt Deitke, Winson Han, Alvaro Herrasti, Aniruddha Kembhavi, Eric Kolve, Roozbeh Mottaghi, Jordi Salvador, Dustin Schwenk, Eli VanderBilt, Matthew Wallingford, Luca Weihs, Mark Yatskar, Ali Farhadi

We argue that interactive and embodied visual AI has reached a stage of development similar to visual recognition prior to the advent of these ecosystems.

Grounded Situation Recognition

1 code implementation ECCV 2020 Sarah Pratt, Mark Yatskar, Luca Weihs, Ali Farhadi, Aniruddha Kembhavi

We introduce Grounded Situation Recognition (GSR), a task that requires producing structured semantic summaries of images describing: the primary activity, entities engaged in the activity with their roles (e.g., agent, tool), and bounding-box groundings of entities.

Grounded Situation Recognition Image Retrieval +1

Don't Take the Easy Way Out: Ensemble Based Methods for Avoiding Known Dataset Biases

3 code implementations IJCNLP 2019 Christopher Clark, Mark Yatskar, Luke Zettlemoyer

Our method has two stages: we (1) train a naive model that makes predictions exclusively based on dataset biases, and (2) train a robust model as part of an ensemble with the naive one in order to encourage it to focus on other patterns in the data that are more likely to generalize.

Natural Language Inference Question Answering +1
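
The two-stage method summarized in the entry above can be illustrated with a small sketch. The abstract snippet does not say how the two models are combined, so the rule used here (adding log-probabilities, i.e. a product of experts) is an assumption, and all function names are hypothetical. The bias-only model's output is treated as a fixed input, standing in for a naive model that was already trained in stage (1).

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def ensemble_log_probs(robust_logits, bias_log_probs):
    """Combine the robust model with the frozen bias-only model.

    Assumed combination rule: add the two log-distributions and renormalize.
    """
    robust_log_probs = log_softmax(robust_logits)
    combined = [r + b for r, b in zip(robust_log_probs, bias_log_probs)]
    return log_softmax(combined)


def ensemble_loss(robust_logits, bias_log_probs, label):
    """Cross-entropy of the ensemble; only the robust model would be updated."""
    return -ensemble_log_probs(robust_logits, bias_log_probs)[label]


def plain_loss(robust_logits, label):
    """Cross-entropy of the robust model on its own, for comparison."""
    return -log_softmax(robust_logits)[label]


# Bias-only model is confident in class 0; robust model is still uninformed.
bias_log_probs = [math.log(p) for p in (0.9, 0.05, 0.05)]
robust_logits = [0.0, 0.0, 0.0]

loss_bias_right = ensemble_loss(robust_logits, bias_log_probs, label=0)
loss_bias_wrong = ensemble_loss(robust_logits, bias_log_probs, label=1)
loss_alone = plain_loss(robust_logits, label=0)
```

In this sketch, an example the bias-only model already gets right produces a smaller ensemble loss than the robust model would incur alone, while an example where the bias is misleading produces a larger one. This is one way the training signal can be shifted toward patterns the naive model does not capture.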

Gender Bias in Contextualized Word Embeddings

1 code implementation NAACL 2019 Jieyu Zhao, Tianlu Wang, Mark Yatskar, Ryan Cotterell, Vicente Ordonez, Kai-Wei Chang

In this paper, we quantify, analyze and mitigate gender bias exhibited in ELMo's contextualized word vectors.

Word Embeddings

Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations

2 code implementations ICCV 2019 Tianlu Wang, Jieyu Zhao, Mark Yatskar, Kai-Wei Chang, Vicente Ordonez

In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables (such as gender) in visual recognition tasks.

Temporal Action Localization

QuAC: Question Answering in Context

no code implementations EMNLP 2018 Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, Luke Zettlemoyer

We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total).

Question Answering Reading Comprehension

A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC

1 code implementation NAACL 2019 Mark Yatskar

We compare three new datasets for question answering: SQuAD 2.0, QuAC, and CoQA, along several of their new features: (1) unanswerable questions, (2) multi-turn interactions, and (3) abstractive answers.

Question Answering

QuAC: Question Answering in Context

no code implementations 21 Aug 2018 Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, Luke Zettlemoyer

We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total).

Question Answering Reading Comprehension

Neural Motifs: Scene Graph Parsing with Global Context

6 code implementations CVPR 2018 Rowan Zellers, Mark Yatskar, Sam Thomson, Yejin Choi

We then introduce Stacked Motif Networks, a new architecture designed to capture higher order motifs in scene graphs that further improves over our strong baseline by an average 7.1% relative gain.

Panoptic Scene Graph Generation

Commonly Uncommon: Semantic Sparsity in Situation Recognition

2 code implementations CVPR 2017 Mark Yatskar, Vicente Ordonez, Luke Zettlemoyer, Ali Farhadi

Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set.

Grounded Situation Recognition Structured Prediction

Situation Recognition: Visual Semantic Role Labeling for Image Understanding

1 code implementation CVPR 2016 Mark Yatskar, Luke Zettlemoyer, Ali Farhadi

This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears, sheep, wool, and field) and most importantly (3) the roles these participants play in the activity (e.g., the man is clipping, the shears are his tool, the wool is being clipped from the sheep, and the clipping is in a field).

Activity Recognition Grounded Situation Recognition +2
