Medical Report Generation
14 papers with code • 1 benchmark • 2 datasets
Medical report generation (MRG) is the task of training AI models to automatically generate a professional report from input medical images. This can help clinicians make faster and more accurate decisions, since writing reports is time-consuming and error-prone even for experienced doctors.
Deep neural networks and transformer-based architectures are currently the most popular approaches to this task; however, when a pre-trained model is transferred into this specific domain, its performance usually degrades.
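To make the typical pipeline concrete, here is a minimal, purely illustrative sketch of the encoder-decoder structure used by most MRG systems: an image encoder produces a feature vector, and an autoregressive decoder emits report tokens conditioned on it. All module and variable names are hypothetical, and the random projections stand in for a real CNN/transformer encoder and a trained language decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_image(image, dim=16):
    """Toy image encoder: global average pooling plus a random linear
    projection. (A real system would use a CNN or vision transformer.)"""
    pooled = image.mean(axis=(0, 1))           # (channels,)
    w = rng.standard_normal((pooled.size, dim))
    return pooled @ w                          # (dim,) image feature

def decode_report(feat, vocab, emb, steps=5):
    """Toy autoregressive decoder: at each step, pick the vocabulary word
    whose embedding best matches the running state, then fold that word's
    embedding back into the state to condition on the history."""
    report = []
    for _ in range(steps):
        scores = emb @ feat                    # similarity to each word
        idx = int(np.argmax(scores))
        report.append(vocab[idx])
        feat = feat + emb[idx]                 # update decoding state
    return " ".join(report)

image = rng.random((32, 32, 3))                # stand-in for an X-ray
vocab = ["no", "acute", "findings", "effusion", "cardiomegaly"]
word_emb = rng.standard_normal((len(vocab), 16))
report = decode_report(encode_image(image), vocab, word_emb)
print(report)                                  # a 5-token toy "report"
```

In a real MRG model the random projections are replaced by learned weights, and decoding uses beam search or sampling over a clinical vocabulary; the control flow, however, has this same encode-then-decode shape.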
The following are some of the reasons why MRG is hard for pre-trained models:
- Language data in the medical domain can differ substantially from the large general-purpose corpora available on the Internet
- During the fine-tuning phase, medical datasets are often unevenly distributed
More recently, multi-modal learning and contrastive learning have shown inspiring results in this field, but the task remains challenging and requires further attention.
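The multi-modal contrastive idea mentioned above can be sketched with a CLIP-style symmetric InfoNCE loss: paired image and report embeddings are pulled together while mismatched pairs in the batch are pushed apart. This is a minimal NumPy illustration, not any specific paper's implementation; the function name and temperature value are assumptions.

```python
import numpy as np

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss over a batch of paired
    image/report embeddings. Matching pairs lie on the diagonal of the
    cosine-similarity matrix."""
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature         # (batch, batch)
    labels = np.arange(len(logits))

    def xent(l):
        # cross-entropy with the diagonal as the correct class
        l = l - l.max(axis=1, keepdims=True)   # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average the image->text and text->image directions
    return (xent(logits) + xent(logits.T)) / 2

rng = np.random.default_rng(0)
emb = rng.standard_normal((4, 32))
# perfectly aligned pairs give a lower loss than random pairings
aligned = info_nce(emb, emb)
mismatched = info_nce(emb, rng.standard_normal((4, 32)))
```

In MRG-oriented systems, this kind of objective is typically used to pre-train a shared image-text embedding space on image/report pairs before (or alongside) training the report decoder.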
Here are some additional readings to go deeper on the task:
- On the Automatic Generation of Medical Imaging Reports
https://doi.org/10.48550/arXiv.1711.08195
- A scoping review of transfer learning research on medical image analysis using ImageNet
https://arxiv.org/abs/2004.13175
- A Survey on Incorporating Domain Knowledge into Deep Learning for Medical Image Analysis
https://arxiv.org/abs/2004.12150
(Image credit: Transformers in Medical Imaging: A Survey)
Most implemented papers
On the Automatic Generation of Medical Imaging Reports
To cope with these challenges, we (1) build a multi-task learning framework which jointly performs the prediction of tags and the generation of paragraphs, (2) propose a co-attention mechanism to localize regions containing abnormalities and generate narrations for them, (3) develop a hierarchical LSTM model to generate long paragraphs.
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Firstly, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image could be considered as irrelevant noise in the training procedure.
DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation
To train and validate the effectiveness of our DNN-based module, we propose a large-scale retinal disease image dataset.
Inspecting state of the art performance and NLP metrics in image-based medical report generation
Several deep learning architectures have been proposed over the last years to deal with the problem of generating a written report given an imaging exam as input.
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
To the best of our knowledge, this is the first work that improves data efficiency of image captioning by utilizing LM pretrained on unimodal data.
Automated radiology report generation using conditioned transformers
We present the first work to condition a pre-trained transformer on visual and semantic features to generate medical reports and to include semantic similarity metrics in the quantitative analysis of the generated reports.
FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark
Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports.
Transformers in Medical Imaging: A Survey
Following unprecedented success on the natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators.
A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets
In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.
M^4I: Multi-modal Models Membership Inference
To achieve this, we propose Multi-modal Models Membership Inference (M^4I) with two attack methods to infer the membership status, named metric-based (MB) M^4I and feature-based (FB) M^4I, respectively.