Medical Visual Question Answering

26 papers with code • 5 benchmarks • 6 datasets

Medical Visual Question Answering (Med-VQA) is the task of answering natural-language clinical questions about a given medical image, such as a radiology or pathology scan. It requires models to integrate visual and textual information, typically under far less labeled data than general-domain VQA.

Most implemented papers

BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

salesforce/lavis 30 Jan 2023

The cost of vision-and-language pre-training has become increasingly prohibitive due to end-to-end training of large-scale models.
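
Because BLIP-2 keeps both the image encoder and the LLM frozen and trains only a lightweight Q-Former between them, it can be queried zero-shot on a medical image through the salesforce/lavis library. A minimal sketch following the LAVIS README; the model/checkpoint names are as published there, while the input file and prompt wording are illustrative assumptions:

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load BLIP-2 with a frozen ViT image encoder and a frozen FlanT5-XL LLM;
# only the Q-Former bridging the two modalities was trained.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5", model_type="pretrain_flant5xl", is_eval=True, device=device
)

raw_image = Image.open("chest_xray.png").convert("RGB")  # hypothetical input
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Zero-shot VQA via prompted generation.
answer = model.generate({"image": image, "prompt": "Question: which organ is shown? Answer:"})
print(answer)
```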

PathVQA: 30000+ Questions for Medical Visual Question Answering

UCSD-AI4H/PathVQA 7 Mar 2020

As a first step toward AI systems that can answer pathology questions, the authors create a visual question answering (VQA) dataset in which an AI agent is presented with a pathology image together with a question and is asked to give the correct answer.
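
A PathVQA-style record pairs a pathology image with a question and an answer (often yes/no or short free-form text). A minimal sketch of loading such records and scoring exact-match accuracy; the JSON layout and file names are hypothetical, not the dataset's actual distribution format:

```python
import json

# Hypothetical record layout: one JSON object per (image, question, answer) triple,
# e.g. {"image": "path_0001.jpg", "question": "Is there necrosis?", "answer": "yes"}
with open("pathvqa_test.json") as f:
    records = json.load(f)

def exact_match_accuracy(records, predict):
    """Score a predictor function predict(image_path, question) -> str."""
    correct = sum(
        predict(r["image"], r["question"]).strip().lower() == r["answer"].strip().lower()
        for r in records
    )
    return correct / len(records)
```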

Flamingo: a Visual Language Model for Few-Shot Learning

mlfoundations/open_flamingo 29 Apr 2022

Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.
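
Flamingo adapts to a new task by conditioning on a handful of interleaved image/text examples rather than by fine-tuning. A sketch of constructing such a few-shot prompt using the <image> and <|endofchunk|> markers from the open_flamingo README; the medical examples themselves are illustrative:

```python
# Few-shot prompt: each in-context example is an image placeholder followed by
# its question/answer text, terminated by <|endofchunk|>. The actual images are
# passed to the model separately, in the same order as the <image> tokens.
shots = [
    ("cxr_1.png", "Question: is there a pleural effusion? Answer: yes."),
    ("cxr_2.png", "Question: is there a pleural effusion? Answer: no."),
]
query = ("cxr_query.png", "Question: is there a pleural effusion? Answer:")

prompt = "".join(f"<image>{text}<|endofchunk|>" for _, text in shots)
prompt += f"<image>{query[1]}"
image_paths = [path for path, _ in shots] + [query[0]]
print(prompt)
```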

Overcoming Data Limitation in Medical Visual Question Answering

aioz-ai/MICCAI19-MedVQA 26 Sep 2019

Traditional approaches for Visual Question Answering (VQA) require a large amount of labeled data for training.
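
This data limitation motivates meta-learning: instead of pre-training on large external corpora, a feature extractor is meta-trained so it adapts to a new task from a few labeled examples (the paper's MEVF framework uses MAML for this). A first-order MAML inner step, sketched with torch.func; the model, loss, and tensors are placeholders:

```python
import torch
from torch import nn
from torch.func import functional_call

def adapt_on_support(model: nn.Module, loss_fn, support_x, support_y, inner_lr=1e-2):
    """First-order MAML inner loop: one gradient step on a small support set,
    returning adapted parameters without mutating the model."""
    params = dict(model.named_parameters())
    preds = functional_call(model, params, (support_x,))
    grads = torch.autograd.grad(loss_fn(preds, support_y), list(params.values()))
    return {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}

# Query-set evaluation with the adapted weights (hypothetical tensors):
# adapted = adapt_on_support(model, nn.functional.cross_entropy, support_x, support_y)
# query_logits = functional_call(model, adapted, (query_x,))
```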

SLAKE: A Semantically-Labeled Knowledge-Enhanced Dataset for Medical Visual Question Answering

pengfeiliheu/m2i2 18 Feb 2021

We show that SLAKE can be used to facilitate the development and evaluation of Med-VQA systems.
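
SLAKE augments its image/question/answer triples with a medical knowledge base, so knowledge-based questions can be answered by combining visual grounding with a triple lookup. A toy sketch of that lookup step; the triples and relation names are illustrative, not SLAKE's actual schema:

```python
# Toy knowledge base of (subject, relation) -> object triples, in the spirit
# of the external knowledge graph shipped with SLAKE.
knowledge = {
    ("liver", "function"): "detoxification",
    ("heart", "function"): "pumping blood",
}

def answer_knowledge_question(detected_organ: str, relation: str) -> str:
    """Answer a knowledge-based question once vision has grounded the organ."""
    return knowledge.get((detected_organ, relation), "unknown")

print(answer_knowledge_question("liver", "function"))  # -> detoxification
```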

Multiple Meta-model Quantifying for Medical Visual Question Answering

aioz-ai/MICCAI21_MMQ 19 May 2021

Most existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset itself is not fully utilized.
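
The method instead extracts image features from several meta-models, each trained on the dataset's own meta-annotations, and fuses them rather than relying on a single externally pre-trained backbone. A toy sketch of that fusion step; the concatenation-based fusion and dimensions here are illustrative, not the paper's exact architecture:

```python
import torch
from torch import nn

class MetaModelFusion(nn.Module):
    """Concatenate image features from multiple meta-models, then project."""
    def __init__(self, meta_models, feat_dim, out_dim=768):
        super().__init__()
        self.meta_models = nn.ModuleList(meta_models)
        self.proj = nn.Linear(feat_dim * len(meta_models), out_dim)

    def forward(self, image):
        feats = torch.cat([m(image) for m in self.meta_models], dim=-1)
        return self.proj(feats)
```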

Self-supervised vision-language pretraining for Medical visual question answering

pengfeiliheu/m2i2 24 Nov 2022

Medical visual question answering (VQA) is the task of answering clinical questions about a given radiographic image, a challenging problem that requires a model to integrate both vision and language information.
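
One of the self-supervised pretraining objectives in this line of work aligns image and report embeddings contrastively. A minimal symmetric InfoNCE (CLIP-style) loss over a batch of paired medical image/text embeddings, as one such objective might be implemented; the temperature value is illustrative:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired image/text embeddings."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature      # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)       # image -> matching text
    loss_t2i = F.cross_entropy(logits.t(), targets)   # text -> matching image
    return (loss_i2t + loss_t2i) / 2
```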

PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering

xiaoman-zhang/PMC-VQA 17 May 2023

In this paper, we focus on the problem of Medical Visual Question Answering (MedVQA), which is crucial in efficiently interpreting medical images with vital clinic-relevant information.
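
Visual instruction tuning casts VQA pairs as (instruction, response) training examples. A sketch of what a single multiple-choice MedVQA training record might look like; the field names are hypothetical, not PMC-VQA's released schema:

```python
sample = {
    "image": "pmc_figure_001.jpg",  # hypothetical file name
    "instruction": (
        "Question: What imaging modality is shown?\n"
        "Options: A. CT  B. MRI  C. Ultrasound  D. X-ray"
    ),
    "response": "B. MRI",
}

# During instruction tuning, the language-modeling loss is applied only to the
# response tokens, conditioned on the image features and the instruction text.
```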

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day

haotian-liu/LLaVA 1 Jun 2023

In this paper, we propose a cost-efficient approach for training a vision-language conversational assistant that can answer open-ended research questions of biomedical images.
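
LLaVA-Med's cost efficiency comes from a short curriculum on top of a general-domain checkpoint: first align biomedical vocabulary on figure-caption pairs, then tune on instruction-following conversations. A sketch of that schedule as a config; the stage names, data names, and epoch counts are placeholders, not the paper's settings:

```python
curriculum = [
    # Stage 1: biomedical concept alignment on figure-caption pairs;
    # typically only the vision-language projection is updated.
    {"stage": "concept_alignment", "data": "figure_caption_pairs", "epochs": 1},
    # Stage 2: open-ended biomedical instruction tuning.
    {"stage": "instruction_tuning", "data": "biomedical_instructions", "epochs": 3},
]

for stage in curriculum:
    print(f"train {stage['stage']} on {stage['data']} for {stage['epochs']} epoch(s)")
```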

A Comparison of Pre-trained Vision-and-Language Models for Multimodal Representation Learning across Medical Images and Reports

YIKUAN8/Transformers-VQA 3 Sep 2020

Joint image-text embedding extracted from medical images and associated contextual reports is the bedrock for most biomedical vision-and-language (V+L) tasks, including medical visual question answering, clinical image-text retrieval, and clinical report auto-generation.
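
Most of the compared models build this joint embedding by feeding image-region features and word embeddings through a shared transformer (a single-stream design). A minimal sketch of that fusion; the layer counts, hidden size, and the 2048-dim region features are illustrative:

```python
import torch
from torch import nn

class JointEncoder(nn.Module):
    """Minimal single-stream V+L encoder: projected image-region features and
    word embeddings are concatenated and contextualized by one transformer."""
    def __init__(self, dim=768, layers=4, heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.img_proj = nn.Linear(2048, dim)  # e.g. detector region features

    def forward(self, img_feats, txt_embs):
        x = torch.cat([self.img_proj(img_feats), txt_embs], dim=1)
        return self.encoder(x)  # joint contextualized image-text embedding

img_feats = torch.randn(2, 36, 2048)  # 36 detected regions per image
txt_embs = torch.randn(2, 20, 768)    # 20 word embeddings per question
joint = JointEncoder()(img_feats, txt_embs)  # shape: (2, 56, 768)
```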