Medical Visual Question Answering

32 papers with code • 5 benchmarks • 7 datasets

Medical visual question answering (Med-VQA) is the task of answering clinical questions about a given medical image by jointly reasoning over visual and language information.

LaPA: Latent Prompt Assist Model For Medical Visual Question Answering

garygutc/lapa_model 19 Apr 2024

In this paper, we propose the Latent Prompt Assist model (LaPA) for medical visual question answering.

MedPromptX: Grounded Multimodal Prompting for Chest X-ray Diagnosis

biomedia-mbzuai/medpromptx 22 Mar 2024

Chest X-ray images are commonly used for predicting acute and chronic cardiopulmonary conditions, but efforts to integrate them with structured clinical data face challenges due to incomplete electronic health records (EHRs).

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

opengvlab/multi-modality-arena 14 Feb 2024

Importantly, all images in this benchmark are sourced from authentic medical scenarios, ensuring alignment with the requirements of the medical field and suitability for evaluating LVLMs.

Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations

promptslab/rosettaeval 10 Feb 2024

Additionally, we facilitated future research and development by releasing a Python module for medical LLM evaluation and establishing a dedicated leaderboard on Hugging Face for medical domain LLMs.

Hallucination Benchmark in Medical Visual Question Answering

knowlab/halt-medvqa 11 Jan 2024

The recent success of large language and vision models (LLVMs) on visual question answering (VQA), particularly their applications in medicine (Med-VQA), has shown great potential for realizing effective visual assistants in healthcare.

PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging

jinlhe/pefomed 5 Jan 2024

In this paper, we propose a parameter-efficient framework for fine-tuning MLLMs, validated on medical visual question answering (Med-VQA) and medical report generation (MRG) tasks using public benchmark datasets.

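As a rough illustration of the parameter-efficient idea above, the sketch below attaches LoRA adapters to a small stand-in backbone via the Hugging Face peft library; the model name, target modules, and hyperparameters are assumptions for illustration, not PeFoMed's actual configuration.

```python
# Minimal LoRA sketch with Hugging Face peft; t5-small and the target
# modules are illustrative stand-ins, not the paper's setup.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the adapter output
    target_modules=["q", "v"],  # attention projections to adapt (model-specific)
    lora_dropout=0.05,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapters are trainable
```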

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

baeseongsu/ehrxqa NeurIPS 2023

To develop our dataset, we first construct two uni-modal resources: 1) The MIMIC-CXR-VQA dataset, our newly created medical visual question answering (VQA) benchmark, specifically designed to augment the imaging modality in EHR QA, and 2) EHRSQL (MIMIC-IV), a refashioned version of a previously established table-based EHR QA dataset.

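To make the combination of the two uni-modal resources concrete, here is a hypothetical multi-modal example in the spirit described above; every field name is invented for illustration and does not reflect the dataset's actual schema.

```python
# Hypothetical multi-modal EHR QA example; field names are invented.
example = {
    "question": "Did the most recent chest X-ray for patient 10021 show cardiomegaly?",
    "modality": "image+table",              # needs both EHR tables and a CXR
    "sql": "SELECT study_id FROM ...",      # table side (EHRSQL-style), elided
    "image_path": "files/p10021/s501/view1.jpg",  # imaging side (MIMIC-CXR-VQA-style)
    "answer": "yes",
}
```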

Med-Flamingo: a Multimodal Medical Few-shot Learner

snap-stanford/med-flamingo 27 Jul 2023

However, existing models typically have to be fine-tuned on sizeable downstream datasets, which poses a significant limitation because data is scarce in many medical applications, necessitating models that can learn from a few examples in real time.

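The few-shot setting described above can be pictured as interleaving support (image, question, answer) triplets ahead of the query, Flamingo-style. The sketch below only assembles such a prompt; the file names and the commented generate call are placeholders, not the repository's actual API.

```python
# Assemble a few-shot multimodal prompt, Flamingo-style; paths and the
# final generate call are placeholders, not the med-flamingo API.
few_shot = [
    ("support1.png", "Is there a pleural effusion?", "Yes, on the left."),
    ("support2.png", "Is the cardiac silhouette enlarged?", "No."),
]
images, prompt = [], ""
for path, question, answer in few_shot:
    images.append(path)
    prompt += f"<image>Question: {question} Answer: {answer}\n"
images.append("query.png")  # the new case to answer
prompt += "<image>Question: Is there a pneumothorax? Answer:"
# answer = model.generate(images=images, text=prompt)  # placeholder call
```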

Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering

holipori/mimic-diff-vqa 22 Jul 2023

Given a pair of main and reference images, this task attempts to answer several questions on both diseases and, more importantly, the differences between them.

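Concretely, a difference-aware example pairs a main study with a reference (e.g., prior) study and asks about the change between them; the record below is a hypothetical illustration, not the MIMIC-Diff-VQA schema.

```python
# Hypothetical difference-VQA sample; field names are illustrative only.
sample = {
    "main_image": "studies/s2_followup.jpg",
    "reference_image": "studies/s1_baseline.jpg",
    "question": "What has changed in the left lung compared with the reference image?",
    "answer": "The left lower lobe opacity has decreased.",
}
```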

Masked Vision and Language Pre-training with Unimodal and Multimodal Contrastive Losses for Medical Visual Question Answering

pengfeiliheu/mumc 11 Jul 2023

Medical visual question answering (VQA) is a challenging task that requires answering clinical questions about a given medical image by considering both visual and language information.

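For readers unfamiliar with the multimodal contrastive objective mentioned above, the following is a minimal sketch of a symmetric image-text InfoNCE loss in PyTorch; the embedding dimension and temperature are illustrative, and this is the generic form of the loss rather than the paper's exact implementation.

```python
# Generic symmetric image-text contrastive (InfoNCE) loss; dimensions and
# temperature are illustrative, not the paper's exact settings.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img_emb = F.normalize(img_emb, dim=-1)          # unit-norm image embeddings
    txt_emb = F.normalize(txt_emb, dim=-1)          # unit-norm text embeddings
    logits = img_emb @ txt_emb.t() / temperature    # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))          # matched pairs lie on the diagonal
    return (F.cross_entropy(logits, targets)        # image-to-text direction
            + F.cross_entropy(logits.t(), targets)  # text-to-image direction
            ) / 2

loss = contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```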