Medical Visual Question Answering
32 papers with code • 5 benchmarks • 7 datasets
Latest papers
LaPA: Latent Prompt Assist Model For Medical Visual Question Answering
In this paper, we propose the Latent Prompt Assist model (LaPA) for medical visual question answering.
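The excerpt does not spell out the mechanism, but "latent prompt" methods typically introduce a small set of learnable embeddings that query the fused image-question features. The sketch below is a hypothetical illustration of that general pattern in PyTorch; the module, dimensions, and attention setup are assumptions, not LaPA's actual architecture.

```python
import torch
import torch.nn as nn

class LatentPromptFusion(nn.Module):
    """Hypothetical sketch: learnable latent prompt tokens attend over
    fused image-question features (not LaPA's actual architecture)."""

    def __init__(self, dim: int = 768, num_prompts: int = 8, num_heads: int = 8):
        super().__init__()
        # Learnable latent prompts, shared across all inputs.
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, fused_feats: torch.Tensor) -> torch.Tensor:
        # fused_feats: (batch, seq_len, dim) image+question features.
        batch = fused_feats.size(0)
        queries = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        # The prompts gather answer-relevant information from the features.
        out, _ = self.attn(queries, fused_feats, fused_feats)
        return out  # (batch, num_prompts, dim)

feats = torch.randn(2, 50, 768)
print(LatentPromptFusion()(feats).shape)  # torch.Size([2, 8, 768])
```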
MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis
Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions, but efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records (EHR).
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM
Importantly, all images in this benchmark are sourced from authentic medical scenarios, ensuring alignment with the requirements of the medical field and suitability for evaluating LVLMs.
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations
Additionally, we facilitated future research and development by releasing a Python module for medical LLM evaluation and establishing a dedicated leaderboard on Hugging Face for medical domain LLMs.
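The released module itself is not shown here, but evaluation on medical multiple-choice benchmarks usually reduces to exact-match accuracy over predicted option letters. The snippet below is a generic, hypothetical scoring function in that spirit; the function name and input format are assumptions, not the released API.

```python
def score_mcq(predictions, references):
    """Exact-match accuracy for multiple-choice medical QA.

    predictions/references: lists of option letters, e.g. ["A", "C", ...].
    A hypothetical stand-in for the released evaluation module.
    """
    assert len(predictions) == len(references)
    correct = sum(p.strip().upper() == r.strip().upper()
                  for p, r in zip(predictions, references))
    return correct / len(references)

print(score_mcq(["A", "b", "C"], ["A", "B", "D"]))  # 0.666...
```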
Hallucination Benchmark in Medical Visual Question Answering
The recent success of large language and vision models (LLVMs) on visual question answering (VQA), particularly their applications in medicine (Med-VQA), has demonstrated great potential for realizing effective visual assistants for healthcare.
PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging
In this paper, we propose a parameter-efficient framework for fine-tuning MLLMs, validated specifically on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks using public benchmark datasets.
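The excerpt does not name the exact adapter scheme; a common parameter-efficient recipe is LoRA via Hugging Face's peft library, sketched below on a generic causal LM. The base checkpoint and target module names are placeholders, not PeFoMed's configuration.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint; PeFoMed fine-tunes a multimodal LLM instead.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

config = LoraConfig(
    r=8,                                  # low-rank update dimension
    lora_alpha=16,                        # scaling factor for the update
    target_modules=["q_proj", "v_proj"],  # attention projections (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable
```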
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
To develop our dataset, we first construct two uni-modal resources: 1) The MIMIC-CXR-VQA dataset, our newly created medical visual question answering (VQA) benchmark, specifically designed to augment the imaging modality in EHR QA, and 2) EHRSQL (MIMIC-IV), a refashioned version of a previously established table-based EHR QA dataset.
Med-Flamingo: a Multimodal Medical Few-shot Learner
However, existing models typically must be fine-tuned on sizeable downstream datasets, which poses a significant limitation: in many medical applications data is scarce, necessitating models that can learn from a few examples in real time.
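Flamingo-style models consume interleaved image-text prompts, so few-shot Med-VQA amounts to concatenating worked examples ahead of the query. The sketch below only assembles such a prompt with placeholder tokens; the `<image>` and `<|endofchunk|>` markers follow OpenFlamingo's convention, but the exact tokens are model-specific assumptions.

```python
def build_fewshot_prompt(examples, query_question):
    """Interleave (question, answer) demonstrations before the query.

    Each image is represented by a placeholder token; the actual pixel
    inputs are passed to the model separately, aligned in order.
    Token conventions follow OpenFlamingo but may differ per model.
    """
    parts = []
    for question, answer in examples:
        parts.append(f"<image>Question: {question} Answer: {answer}<|endofchunk|>")
    parts.append(f"<image>Question: {query_question} Answer:")
    return "".join(parts)

demos = [("Is there a fracture?", "no"),
         ("Which lobe shows consolidation?", "right lower lobe")]
print(build_fewshot_prompt(demos, "Is the heart enlarged?"))
```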
Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering
Given a pair of main and reference images, the task is to answer several questions about both diseases and, more importantly, the differences between them.
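The paper builds an expert-knowledge-aware graph over the image pair, which is not reproduced here; as a point of reference, the minimal sketch below shows a naive difference-aware fusion baseline, with all names and dimensions assumed for illustration.

```python
import torch
import torch.nn as nn

class DifferenceFusion(nn.Module):
    """Hypothetical baseline: fuse main/reference image features with
    their explicit difference (not the paper's expert-knowledge graph)."""

    def __init__(self, dim: int = 512):
        super().__init__()
        self.proj = nn.Linear(3 * dim, dim)

    def forward(self, main_feats, ref_feats):
        # Concatenate both views plus their difference, then project.
        fused = torch.cat([main_feats, ref_feats, main_feats - ref_feats], dim=-1)
        return self.proj(fused)

main, ref = torch.randn(2, 512), torch.randn(2, 512)
print(DifferenceFusion()(main, ref).shape)  # torch.Size([2, 512])
```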
Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering
Medical visual question answering (VQA) is a challenging task that requires answering clinical questions about a given medical image by taking both visual and language information into account.
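The multimodal contrastive part of such pre-training is usually a symmetric InfoNCE objective over matched image-text pairs in a batch. The sketch below is a generic PyTorch version of that loss; the paper's exact combination of unimodal and multimodal losses is not reproduced here.

```python
import torch
import torch.nn.functional as F

def infonce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over matched image-text pairs in a batch.

    A generic sketch of the multimodal contrastive objective; not the
    paper's exact unimodal/multimodal loss combination.
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(logits.size(0))        # i-th image matches i-th text
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = infonce_loss(torch.randn(4, 256), torch.randn(4, 256))
print(loss.item())
```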