Medical Visual Question Answering

32 papers with code • 5 benchmarks • 7 datasets

Medical visual question answering (Med-VQA) asks a model to answer clinical questions about a medical image, such as a radiograph or a pathology slide, which requires integrating both vision and language information.


Most implemented papers

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

salesforce/lavis 30 Jan 2023

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.
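BLIP-2 checkpoints are also exposed through the Hugging Face transformers API, so a quick zero-shot Med-VQA probe can look like the sketch below. The image path and the clinical question are placeholders, and the general-domain Salesforce/blip2-opt-2.7b checkpoint is only a stand-in for whatever (possibly medically fine-tuned) weights you actually evaluate.

```python
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# General-domain BLIP-2 checkpoint; answers on clinical images are unvalidated.
processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

image = Image.open("chest_xray.png").convert("RGB")  # placeholder path
# BLIP-2's documented VQA prompt format: "Question: ... Answer:"
prompt = "Question: is there a pleural effusion? Answer:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(device, dtype)
out = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(out[0], skip_special_tokens=True).strip())
```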

PathVQA: 30000+ Questions for Medical Visual Question Answering

UCSD-AI4H/PathVQA 7 Mar 2020

The first step toward building an "AI pathologist" is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
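Concretely, each example is an (image, question, answer) triple. The minimal PyTorch Dataset below sketches how such triples can be wrapped for training; the root/images folder and root/qas.json manifest are a hypothetical layout for illustration, not the repository's actual file format.

```python
import json
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset


class PathologyVQADataset(Dataset):
    """(image, question, answer) triples for Med-VQA training.

    Assumes a hypothetical layout (illustrative only):
        root/images/<name>.jpg
        root/qas.json  -- list of {"image", "question", "answer"} dicts
    """

    def __init__(self, root, transform=None):
        self.root = Path(root)
        self.records = json.loads((self.root / "qas.json").read_text())
        self.transform = transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        rec = self.records[idx]
        image = Image.open(self.root / "images" / rec["image"]).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, rec["question"], rec["answer"]
```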

Flamingo: a Visual Language Model for Few-Shot Learning

mlfoundations/open_flamingo NeurIPS 2022

Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.
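The open_flamingo reimplementation makes this few-shot recipe easy to try on Med-VQA-style prompts. The sketch below follows the project README for the OpenFlamingo-3B configuration; the image paths and the one-shot clinical prompt are illustrative placeholders.

```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image

from open_flamingo import create_model_and_transforms

# Architecture hyperparameters follow the released OpenFlamingo-3B recipe.
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path="anas-awadalla/mpt-1b-redpajama-200b",
    tokenizer_path="anas-awadalla/mpt-1b-redpajama-200b",
    cross_attn_every_n_layers=1,
)
ckpt = hf_hub_download("openflamingo/OpenFlamingo-3B-vitl-mpt1b", "checkpoint.pt")
model.load_state_dict(torch.load(ckpt, map_location="cpu"), strict=False)

# One in-context demonstration plus the query image (placeholder paths).
demo = Image.open("demo_xray.png").convert("RGB")
query = Image.open("query_xray.png").convert("RGB")
vision_x = torch.stack([image_processor(demo), image_processor(query)])
vision_x = vision_x.unsqueeze(1).unsqueeze(0)  # (batch, n_images, frames, C, H, W)

tokenizer.padding_side = "left"
lang_x = tokenizer(
    ["<image>Question: Is the heart enlarged? Answer: no.<|endofchunk|>"
     "<image>Question: Is there a pleural effusion? Answer:"],
    return_tensors="pt",
)
out = model.generate(
    vision_x=vision_x,
    lang_x=lang_x["input_ids"],
    attention_mask=lang_x["attention_mask"],
    max_new_tokens=10,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```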

BiomedCLIP: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs

microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 2 Mar 2023

Training an effective generalist biomedical model requires high-quality multimodal data, such as parallel image-text pairs.
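The released checkpoint loads through open_clip (the hf-hub: calls below follow the model card). One simple way to apply it to closed-set Med-VQA is to rank candidate answers against the image, as in this sketch; the image path and candidate answer strings are placeholders.

```python
import torch
from PIL import Image
import open_clip

tag = "hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224"
model, preprocess = open_clip.create_model_from_pretrained(tag)
tokenizer = open_clip.get_tokenizer(tag)
model.eval()

image = preprocess(Image.open("pathology_slide.png").convert("RGB")).unsqueeze(0)
# Treat a closed-set question as ranking candidate answers (placeholders).
candidates = ["adenocarcinoma is present", "no tumor is present"]
texts = tokenizer(candidates)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(texts)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for answer, p in zip(candidates, probs[0].tolist()):
    print(f"{answer}: {p:.3f}")
```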

Overcoming Data Limitation in Medical Visual Question Answering

aioz-ai/MICCAI19-MedVQA 26 Sep 2019

Traditional approaches for Visual Question Answering (VQA) require large amounts of labeled data for training.

SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering

pengfeiliheu/m2i2 18 Feb 2021

We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems.

Multiple Meta-model Quantifying for Medical Visual Question Answering

aioz-ai/MICCAI21_MMQ 19 May 2021

Most existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset itself is not fully utilized.

Self-supervised vision-language pretraining for Medical visual question answering

pengfeiliheu/m2i2 24 Nov 2022

Medical visual question answering (VQA) aims to answer clinical questions given a radiographic image, a challenging problem that requires a model to integrate both vision and language information.
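Before large pretrained vision-language models, the standard baseline for this formulation encoded the image and the question separately, fused the two embeddings, and classified over a fixed answer set. The sketch below shows that generic pattern (a ResNet-18 image encoder, an LSTM question encoder, element-wise fusion), not the paper's own self-supervised architecture; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18


class FusionVQA(nn.Module):
    """Minimal fuse-and-classify Med-VQA baseline (illustrative sizes)."""

    def __init__(self, vocab_size, num_answers, dim=512):
        super().__init__()
        cnn = resnet18(weights=None)                     # image encoder
        cnn.fc = nn.Linear(cnn.fc.in_features, dim)
        self.image_encoder = cnn
        self.embed = nn.Embedding(vocab_size, 300)
        self.lstm = nn.LSTM(300, dim, batch_first=True)  # question encoder
        self.classifier = nn.Linear(dim, num_answers)

    def forward(self, images, question_tokens):
        v = self.image_encoder(images)                   # (B, dim)
        _, (h, _) = self.lstm(self.embed(question_tokens))
        q = h[-1]                                        # (B, dim)
        return self.classifier(v * q)  # element-wise fusion -> answer logits


model = FusionVQA(vocab_size=5000, num_answers=500)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 5000, (2, 12)))
print(logits.shape)  # torch.Size([2, 500])
```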

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

xiaoman-zhang/PMC-VQA 17 May 2023

In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA), which is crucial for efficiently interpreting medical images that carry vital, clinically relevant information.

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

baeseongsu/ehrxqa NeurIPS 2023

To develop our dataset, we first construct two uni-modal resources: 1) The MIMIC-CXR-VQA dataset, our newly created medical visual question answering (VQA) benchmark, specifically designed to augment the imaging modality in EHR QA, and 2) EHRSQL (MIMIC-IV), a refashioned version of a previously established table-based EHR QA dataset.