Medical Visual Question Answering

30 papers with code • 5 benchmarks • 7 datasets

Medical visual question answering (Med-VQA) aims to answer natural language questions based on given medical images, combining visual understanding of clinical imagery with language comprehension.

Latest papers with no code

Enhancing Generalization in Medical Visual Question Answering Tasks via Gradient-Guided Model Perturbation

no code yet • 5 Mar 2024

In this paper, we introduce a method that applies gradient-guided parameter perturbations to the visual encoder of the multimodal model during both the pre-training and fine-tuning phases, to improve model generalization on downstream medical VQA tasks.
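The core idea named in the snippet, perturbing weights along the gradient direction before computing the update, can be sketched on a toy one-parameter model. Everything here (the squared-error loss, the perturbation scale `epsilon`, the learning rate) is an illustrative assumption in the spirit of sharpness-aware schemes, not the paper's actual implementation.

```python
# Hedged sketch of gradient-guided parameter perturbation (assumption:
# a SAM-like scheme; the model, loss, and hyperparameters are toys).

def loss(w, x, y):
    # Toy squared-error loss for a one-parameter "encoder": y_hat = w * x.
    return (w * x - y) ** 2

def grad(w, x, y):
    # Analytic gradient of the toy loss with respect to w.
    return 2 * (w * x - y) * x

def perturbed_step(w, x, y, lr=0.1, epsilon=0.05):
    """One update: nudge w along the sign of its gradient, then descend
    using the gradient evaluated at the perturbed point."""
    g = grad(w, x, y)
    direction = 1.0 if g >= 0 else -1.0      # gradient-guided direction
    w_perturbed = w + epsilon * direction    # perturb before the update
    g_perturbed = grad(w_perturbed, x, y)    # gradient at perturbed weights
    return w - lr * g_perturbed              # update the original weights

w = 0.0
for _ in range(50):
    w = perturbed_step(w, x=1.0, y=2.0)
```

With these toy values the iterate settles near the minimizer `w = 2.0`, oscillating slightly because the perturbation keeps the evaluated gradient off-center; the intent of such schemes is a flatter, more generalizable solution rather than faster convergence.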

Prompt-based Personalized Federated Learning for Medical Visual Question Answering

no code yet • 15 Feb 2024

We present a novel prompt-based personalized federated learning (pFL) method to address data heterogeneity and privacy concerns in traditional medical visual question answering (VQA) methods.
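The division of labor that personalized federated learning implies, a shared component averaged across clients and a prompt-like component kept local, can be sketched with a toy linear model. The client names, the scalar "prompt", and the toy update rule are all illustrative assumptions, not the paper's method.

```python
# Hedged sketch of prompt-based personalized FL (assumption: clients
# average one shared weight via FedAvg but keep a local prompt scalar;
# names, data, and the update rule are illustrative toys).

def local_update(shared_w, prompt, data, lr=0.1):
    # One pass of toy SGD on y_hat = shared_w * x + prompt.
    for x, y in data:
        err = shared_w * x + prompt - y
        shared_w -= lr * err * x   # shared part: returned to the server
        prompt -= lr * err         # personalized part: never leaves the client
    return shared_w, prompt

clients = {
    "hospital_a": {"prompt": 0.0, "data": [(1.0, 3.0), (2.0, 5.0)]},
    "hospital_b": {"prompt": 0.0, "data": [(1.0, 2.0), (2.0, 4.0)]},
}
shared_w = 0.0
for _ in range(20):
    updates = []
    for c in clients.values():
        w_i, c["prompt"] = local_update(shared_w, c["prompt"], c["data"])
        updates.append(w_i)
    shared_w = sum(updates) / len(updates)   # FedAvg on shared weights only
```

Only the shared weight is aggregated, so each hospital's prompt absorbs its local data shift (here, different intercepts), which is how such schemes address data heterogeneity while raw data and personalized parameters stay on-site.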

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

no code yet • 14 Feb 2024

A significant challenge arises from the scarcity of diverse medical images spanning various modalities and anatomical regions, which is essential in real-world medical applications.

Free Form Medical Visual Question Answering in Radiology

no code yet • 23 Jan 2024

We innovatively augment the SLAKE dataset, enabling our model to respond to a more diverse array of questions, not limited to the immediate content of radiology or pathology images.

MISS: A Generative Pretraining and Finetuning Approach for Med-VQA

no code yet • 10 Jan 2024

However, most methods in the medical field treat VQA as an answer-classification task, which is difficult to transfer to practical application scenarios.
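The contrast the snippet draws, classification over a fixed answer set versus generative decoding, can be made concrete with two toy decoders. The answer vocabulary, scores, and step function below are illustrative assumptions, not any paper's model.

```python
# Hedged sketch contrasting the two answer regimes (toy values throughout).

ANSWER_SET = ["yes", "no", "pneumonia", "left lung"]   # closed answer set

def classify_answer(logits):
    # Classification: argmax over a fixed vocabulary. The model can never
    # emit an answer outside ANSWER_SET, which limits practical transfer.
    return ANSWER_SET[max(range(len(logits)), key=logits.__getitem__)]

def generate_answer(step_fn, max_len=5):
    # Generation: decode tokens one by one until an end marker, so the
    # answer space is open-ended free-form text.
    tokens = []
    while len(tokens) < max_len:
        tok = step_fn(tokens)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return " ".join(tokens)

print(classify_answer([0.1, 0.2, 1.5, 0.3]))                    # "pneumonia"
print(generate_answer(lambda t: ["mild", "effusion", "<eos>"][len(t)]))
# "mild effusion" -- a phrase a closed-set classifier could not produce
# unless it was pre-enumerated in ANSWER_SET.
```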

BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering

no code yet • 13 Dec 2023

Medical Visual Question Answering (Med-VQA) is an important task in the healthcare industry, in which a natural language question is answered based on a medical image.

A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis

no code yet • 31 Oct 2023

This work conducts an evaluation of GPT-4V's multimodal capability for medical image analysis, focusing on three representative tasks: radiology report generation, medical visual question answering, and medical visual grounding.

Visual Question Answering in the Medical Domain

no code yet • 20 Sep 2023

Medical visual question answering (Med-VQA) is a machine learning task that aims to create a system that can answer natural language questions based on given medical images.

UIT-Saviors at MEDVQA-GI 2023: Improving Multimodal Learning with Image Enhancement for Gastrointestinal Visual Question Answering

no code yet • 6 Jul 2023

The ImageCLEFmed-MEDVQA-GI-2023 challenge carried out a visual question answering task in the gastrointestinal domain, which includes gastroscopy and colonoscopy images.

Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder

no code yet • 4 Apr 2023

To bridge this gap, in this paper we propose a new Transformer-based framework for medical VQA, named Q2ATransformer, which integrates the advantages of both the classification and generation approaches and provides a unified treatment of closed-end and open-end questions.