
Greatest papers with code

A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports

3 Sep 2020 · YIKUAN8/Transformers-VQA

Joint image-text embedding extracted from medical images and associated contextual reports is the bedrock for most biomedical vision-and-language (V+L) tasks, including medical visual question answering, clinical image-text retrieval, and clinical report auto-generation.

MEDICAL VISUAL QUESTION ANSWERING · QUESTION ANSWERING · REPRESENTATION LEARNING · VISUAL QUESTION ANSWERING
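The core idea of a joint image-text embedding can be sketched in a few lines: each modality is projected into a shared space, where similarity scores drive retrieval or answering. This toy sketch uses made-up dimensions and hand-picked projection weights purely for illustration; it is not the paper's model.

```python
import math

def project(features, weights):
    """Linear projection of a feature vector into the shared embedding space."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def cosine(a, b):
    """Cosine similarity between two embeddings in the shared space."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-d image features and 2-d text features (assumed values),
# each projected to the same 2-d shared space.
image_feats = [0.2, 0.7, 0.1]
text_feats = [0.9, 0.4]
W_img = [[0.5, 0.1, 0.3], [0.2, 0.8, 0.1]]  # assumed projection weights
W_txt = [[0.6, 0.2], [0.1, 0.9]]

img_emb = project(image_feats, W_img)
txt_emb = project(text_feats, W_txt)
similarity = cosine(img_emb, txt_emb)  # score usable for image-text retrieval
```

In a trained V+L model these projections are learned so that matching image-report pairs score higher than mismatched ones; the mechanics of "embed both, then compare" are the same.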

PathVQA: 30000+ Questions for Medical Visual Question Answering

7 Mar 2020 · UCSD-AI4H/PathVQA

To achieve this goal, the first step is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.

MEDICAL VISUAL QUESTION ANSWERING · QUESTION ANSWERING · VISUAL QUESTION ANSWERING
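A VQA dataset of the kind described above pairs each image with a question and a gold answer. A minimal sketch of such a sample, with field names that are assumptions rather than the dataset's actual schema:

```python
from dataclasses import dataclass

@dataclass
class VQASample:
    """One VQA example: a pathology image plus a question-answer pair."""
    image_path: str
    question: str
    answer: str

sample = VQASample(
    image_path="images/slide_0001.png",  # hypothetical path
    question="Is a tumor present in this tissue section?",
    answer="yes",
)

def is_closed_ended(s: VQASample) -> bool:
    """Yes/no questions are commonly evaluated separately from open-ended ones."""
    return s.answer.lower() in {"yes", "no"}
```

An AI agent is then scored on how often its predicted answer matches the gold answer across such triples.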

Hierarchical Deep Multi-modal Network for Medical Visual Question Answering

27 Sep 2020 · Swati17293/HQS

To address this issue, we propose a hierarchical deep multi-modal network that analyzes and classifies end-user questions/queries and then incorporates a query-specific approach for answer prediction.

MEDICAL VISUAL QUESTION ANSWERING · QUESTION ANSWERING · VISUAL QUESTION ANSWERING
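The two-stage idea in that abstract, classify the question first, then apply a query-specific predictor, can be sketched as simple dispatch. The keyword-based classifier and the stub answer heads below are placeholders for illustration, not the paper's network.

```python
def classify_question(question: str) -> str:
    """Stage 1: coarse question-type classification (toy keyword heuristic)."""
    q = question.lower()
    if q.startswith(("is ", "are ", "does ")):
        return "closed"  # yes/no style question
    return "open"        # free-form answer expected

def answer_closed(question: str) -> str:
    """Stub stand-in for a binary (yes/no) answer head."""
    return "yes"

def answer_open(question: str) -> str:
    """Stub stand-in for a free-form answer head."""
    return "chest x-ray"

# Stage 2: each question type is routed to its own predictor.
PREDICTORS = {"closed": answer_closed, "open": answer_open}

def predict(question: str) -> str:
    qtype = classify_question(question)   # classify the end-user query
    return PREDICTORS[qtype](question)    # query-specific answer prediction
```

In the actual model both stages are learned jointly from image and question features; the dispatch structure is what makes the approach "hierarchical".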