Medical Visual Question Answering
30 papers with code • 5 benchmarks • 7 datasets
Latest papers with no code
Enhancing Generalization in Medical Visual Question Answering Tasks via Gradient-Guided Model Perturbation
In this paper, we introduce a method that applies gradient-guided parameter perturbations to the visual encoder of the multimodal model during both the pre-training and fine-tuning phases, improving model generalization on downstream medical VQA tasks.
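The core idea can be illustrated with a minimal sketch: perturb the visual encoder's parameters along their own gradient direction before the optimizer step. The function name, the per-parameter normalization, and the `epsilon` scale are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def gradient_guided_perturb(visual_encoder, loss, epsilon=1e-3):
    """Perturb visual-encoder parameters along their gradient direction.

    Hypothetical sketch of gradient-guided perturbation: the gradient of
    the current loss w.r.t. each encoder parameter picks the perturbation
    direction, and a small normalized step is added in place.
    """
    params = list(visual_encoder.parameters())
    grads = torch.autograd.grad(loss, params, retain_graph=True,
                                allow_unused=True)
    with torch.no_grad():
        for p, g in zip(params, grads):
            if g is not None:
                # Normalize per parameter tensor so the step stays small
                # regardless of gradient magnitude.
                p.add_(epsilon * g / (g.norm() + 1e-12))
```

In a training loop this would be called after computing the loss but before `loss.backward()` and the optimizer step, so subsequent gradients are taken at the perturbed point.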
Prompt-based Personalized Federated Learning for Medical Visual Question Answering
We present a novel prompt-based personalized federated learning (pFL) method to address data heterogeneity and privacy concerns in traditional medical visual question answering (VQA) methods.
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM
A significant challenge arises from the scarcity of diverse medical images spanning various modalities and anatomical regions, which is essential in real-world medical applications.
Free Form Medical Visual Question Answering in Radiology
We innovatively augment the SLAKE dataset, enabling our model to respond to a more diverse array of questions, not limited to the immediate content of radiology or pathology images.
MISS: A Generative Pretraining and Finetuning Approach for Med-VQA
However, most methods in the medical field treat VQA as an answer-classification task, which is difficult to transfer to practical application scenarios.
BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering
Medical Visual Question Answering (Med-VQA) is an important task in the healthcare industry, in which a natural language question is answered based on a medical image.
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis
This work conducts an evaluation of GPT-4V's multimodal capability for medical image analysis, with a focus on three representative tasks of radiology report generation, medical visual question answering, and medical visual grounding.
Visual Question Answering in the Medical Domain
Medical visual question answering (Med-VQA) is a machine learning task that aims to create a system that can answer natural language questions based on given medical images.
UIT-Saviors at MEDVQA-GI 2023: Improving Multimodal Learning with Image Enhancement for Gastrointestinal Visual Question Answering
The ImageCLEFmed-MEDVQA-GI-2023 challenge ran a visual question answering task in the gastrointestinal domain, covering gastroscopy and colonoscopy images.
Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder
To bridge this gap, in this paper we propose a new Transformer-based framework for medical VQA (named Q2ATransformer), which integrates the advantages of both the classification and the generation approaches and provides a unified treatment of closed-end and open-ended questions.
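The answer-querying idea can be sketched as a small decoder in which one learnable embedding per candidate answer cross-attends to the fused image-question features and is scored for relevance. Layer choices, dimensions, and names here are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class AnswerQueryingDecoder(nn.Module):
    """Minimal sketch in the spirit of Q2ATransformer's decoder:
    learnable candidate-answer queries attend to fused multimodal
    features, and each attended query yields a relevance logit."""

    def __init__(self, num_answers=16, dim=32, heads=4):
        super().__init__()
        # One learnable query embedding per candidate answer.
        self.answer_queries = nn.Parameter(torch.randn(num_answers, dim))
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)  # per-answer relevance logit

    def forward(self, fused_feats):
        # fused_feats: (batch, seq_len, dim) image-question features.
        b = fused_feats.size(0)
        q = self.answer_queries.unsqueeze(0).expand(b, -1, -1)
        attended, _ = self.cross_attn(q, fused_feats, fused_feats)
        return self.score(attended).squeeze(-1)  # (batch, num_answers)
```

Scoring fixed answer queries keeps the closed-set classification view, while the attention mechanism lets the same decoder condition on free-form question content.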