Image ad understanding is a crucial task with wide real-world applications.
Arjun R. Akula, Brendan Driscoll, Pradyumna Narayana, Soravit Changpinyo, Zhiwei Jia, Suyash Damle, Garima Pruthi, Sugato Basu, Leonidas Guibas, William T. Freeman, Yuanzhen Li, Varun Jampani
Towards this goal, we introduce MetaCLUE, a set of vision tasks on visual metaphor.
Visual question answering (VQA) is the multi-modal task of answering natural language questions about an input image.
More concretely, our CX-ToM framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.
To measure the true progress of existing models, we split the test set into two sets, one which requires reasoning on linguistic structure and the other which doesn't.
We present a new explainable AI (XAI) framework aimed at increasing justified human trust in, and reliance on, the AI machine through explanations.
This paper presents an explainable AI (XAI) system that provides explanations for its predictions.