Self-Critical Reasoning for Robust Visual Question Answering

NeurIPS 2019 · Jialin Wu, Raymond J. Mooney

Visual Question Answering (VQA) deep-learning systems tend to capture superficial statistical correlations in the training data because of strong language priors and fail to generalize to test data with a significantly different question-answer (QA) distribution. To address this issue, we introduce a self-critical training objective that ensures that visual explanations of correct answers match the most influential image regions more than other competitive answer candidates...

Results from the Paper


TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK
Visual Question Answering | VQA-CP | UpDn+SCR (VQA-X) | Score | 49.45 | # 2