Medical Visual Question Answering

8 papers with code • 3 benchmarks • 1 dataset

Medical Visual Question Answering (MedVQA) presents a model with a medical image and a natural-language question about it, and requires the model to produce the correct answer.

Most implemented papers

PathVQA: 30000+ Questions for Medical Visual Question Answering

UCSD-AI4H/PathVQA 7 Mar 2020

To achieve this goal, the first step is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
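The excerpt describes the basic unit of such a dataset: an (image, question, answer) triple. The sketch below shows one plausible way to represent PathVQA-style examples; the field names and the yes/no split are illustrative, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class VQAExample:
    image_path: str   # path to the pathology image
    question: str     # natural-language question about the image
    answer: str       # ground-truth answer (free-form text or yes/no)

def is_closed_ended(example: VQAExample) -> bool:
    # Yes/no questions are commonly evaluated separately from open-ended ones.
    return example.answer.strip().lower() in {"yes", "no"}

examples = [
    VQAExample("img_0001.jpg", "Is there evidence of necrosis?", "yes"),
    VQAExample("img_0002.jpg", "What type of tissue is shown?", "epithelium"),
]
closed = [ex for ex in examples if is_closed_ended(ex)]
```

Splitting examples this way mirrors how MedVQA benchmarks typically report accuracy for closed-ended and open-ended questions separately.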

Overcoming Data Limitation in Medical Visual Question Answering

aioz-ai/MICCAI19-MedVQA 26 Sep 2019

Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training.

Multiple Meta-model Quantifying for Medical Visual Question Answering

aioz-ai/MICCAI21_MMQ 19 May 2021

However, most of the existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset is not fully utilized.

A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports

YIKUAN8/Transformers-VQA 3 Sep 2020

Joint image-text embedding extracted from medical images and associated contextual reports is the bedrock for most biomedical vision-and-language (V+L) tasks, including medical visual question answering, clinical image-text retrieval, and clinical report auto-generation.

Hierarchical Deep Multi-modal Network for Medical Visual Question Answering

Swati17293/HQS 27 Sep 2020

To address this issue, we propose a hierarchical deep multi-modal network that analyzes and classifies end-user questions/queries and then incorporates a query-specific approach for answer prediction.
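The two-stage idea in this excerpt — first classify the question, then dispatch to a query-specific answering branch — can be sketched as simple routing. The keyword rules and branch names below are invented for illustration; the paper's actual classifier is a learned model, not hand-written rules.

```python
def classify_question(question: str) -> str:
    # Toy stand-in for the paper's learned question classifier.
    q = question.lower()
    if q.startswith(("is ", "are ", "does ", "do ")):
        return "yes_no"
    if q.startswith("where"):
        return "location"
    return "open_ended"

def answer(question: str, branches: dict) -> str:
    # Route the question to the branch specialized for its predicted type.
    return branches[classify_question(question)](question)

# Hypothetical per-type answering branches (real ones would be neural models).
branches = {
    "yes_no": lambda q: "yes",
    "location": lambda q: "left lung",
    "open_ended": lambda q: "unknown",
}
```

The benefit of this structure is that each branch can use a prediction head (binary classifier, region predictor, free-form generator) matched to its question type.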

Multi-modal Understanding and Generation for Medical Images and Text via Vision-Language Pre-Training

SuperSupermoon/MedViLL 24 May 2021

We propose a new model which adopts a Transformer-based architecture combined with a novel multimodal attention masking scheme to maximize generalization performance for both vision-language understanding tasks (e.g., diagnosis classification) and vision-language generation tasks (e.g., radiology report generation).
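The masking scheme can be illustrated with a toy mask builder (this is a simplified sketch, not MedViLL's actual code): image tokens attend bidirectionally to everything, while text tokens attend either bidirectionally (understanding tasks) or causally (generation tasks such as report generation).

```python
def build_mask(num_image_tokens: int, num_text_tokens: int, causal_text: bool):
    # mask[i][j] == 1 means position i may attend to position j.
    # Image tokens occupy positions [0, num_image_tokens); text follows.
    n = num_image_tokens + num_text_tokens
    mask = [[1] * n for _ in range(n)]
    if causal_text:
        # For generation, each text token is blocked from future text tokens;
        # it can still attend to all image tokens, which precede the text.
        for i in range(num_image_tokens, n):
            for j in range(i + 1, n):
                mask[i][j] = 0
    return mask
```

Switching a single mask between the two modes is what lets one set of Transformer weights serve both understanding and generation objectives.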

Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain?

sarahesl/pubmedclip 27 Dec 2021

This work evaluates the effectiveness of CLIP for the task of Medical Visual Question Answering (MedVQA).
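CLIP-style models score image–text pairs by cosine similarity between learned embeddings. The toy sketch below shows that scoring step for answer ranking in MedVQA; the embedding vectors are hard-coded placeholders, whereas PubMedCLIP would produce them with trained image and text encoders.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_answers(query_emb, answer_embs):
    # Pick the candidate answer whose embedding best matches the query.
    scores = {ans: cosine(query_emb, emb) for ans, emb in answer_embs.items()}
    return max(scores, key=scores.get)

# Placeholder embeddings: a joint image+question vector and two candidates.
query = [0.9, 0.1, 0.0]
answers = {"pneumonia": [1.0, 0.0, 0.0], "fracture": [0.0, 1.0, 0.0]}
best = rank_answers(query, answers)
```

Whether embeddings pretrained on general web data transfer to radiology and pathology images is exactly the question this paper investigates.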