Joint image-text embeddings extracted from medical images and their associated reports are the bedrock of most biomedical vision-and-language (V+L) tasks, including medical visual question answering, clinical image-text retrieval, and clinical report auto-generation.
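As a concrete illustration, below is a minimal sketch of how such a joint embedding can be learned with a CLIP-style contrastive objective. The encoder backbones, projection dimensions, and names are assumptions for illustration, not the method of any specific paper listed here.

```python
# Minimal sketch of a joint image-text embedding trained with a
# contrastive objective. Encoders, dims, and names are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    def __init__(self, image_encoder, text_encoder,
                 img_dim, txt_dim, embed_dim=256):
        super().__init__()
        self.image_encoder = image_encoder  # e.g., a CNN/ViT backbone -> (B, img_dim)
        self.text_encoder = text_encoder    # e.g., a BERT-style encoder -> (B, txt_dim)
        self.img_proj = nn.Linear(img_dim, embed_dim)
        self.txt_proj = nn.Linear(txt_dim, embed_dim)

    def forward(self, images, reports):
        # Project both modalities into one unit-normalized embedding space.
        img = F.normalize(self.img_proj(self.image_encoder(images)), dim=-1)
        txt = F.normalize(self.txt_proj(self.text_encoder(reports)), dim=-1)
        return img, txt

def contrastive_loss(img, txt, temperature=0.07):
    # Paired image/report rows are positives; all other pairs are negatives.
    logits = img @ txt.t() / temperature
    targets = torch.arange(img.size(0), device=img.device)
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2
```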
To achieve this goal, the first step is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
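For illustration, a single example in such a dataset might look like the following. The field names and file path are hypothetical, not the schema of any particular dataset.

```python
# One hypothetical VQA example: an image paired with a question and answer.
sample = {
    "image": "images/pathology_0001.png",       # path to a pathology image
    "question": "What type of tissue is shown?",
    "answer": "epithelial tissue",              # free-form or yes/no answer
}
```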
Traditional approaches for Visual Question Answering (VQA) require large amounts of labeled data for training.
To address this issue, we propose a hierarchical deep multi-modal network that first analyzes and classifies end-user questions, then applies a query-specific approach for answer prediction.
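A hedged sketch of that hierarchical routing idea follows: a classifier predicts the question type, and each type is handled by its own answer head. The module names, question types, and vocabulary size are illustrative assumptions, not the authors' implementation.

```python
# Sketch of hierarchical routing: classify the question type, then route
# each example to a type-specific answer head. All names are illustrative.
import torch
import torch.nn as nn

class HierarchicalVQA(nn.Module):
    def __init__(self, fused_dim=512, answer_vocab=1000):
        super().__init__()
        # Predicts the question type, e.g., yes/no vs. open-ended.
        self.type_classifier = nn.Linear(fused_dim, 2)
        self.answer_heads = nn.ModuleList([
            nn.Linear(fused_dim, 2),             # yes/no head
            nn.Linear(fused_dim, answer_vocab),  # open-ended head over a vocab
        ])

    def forward(self, fused):
        # `fused` holds fused image+question features, shape (B, fused_dim).
        q_type = self.type_classifier(fused).argmax(dim=-1)
        # Route each example through the head for its predicted type.
        return [self.answer_heads[int(t)](f) for t, f in zip(q_type, fused)]
```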