ViQuAE is a dataset for KVQAE (Knowledge-based Visual Question Answering about named Entities), the task of answering questions about named entities grounded in a visual context using a Knowledge Base. It is the first KVQAE dataset to cover a wide range of entity types (e.g. persons, landmarks, and products). We argue that KVQAE is a clear, well-defined task that can be evaluated easily, making it suitable for tracking progress in the quality of multimodal entity representations. Multimodal entity representation is a central issue for making human-machine interactions more natural. For example, while watching a movie, one might wonder “Where have I seen this actress before?” or “Did she ever win an Oscar?”

