MedTSS: transforming abstractive summarization of scientific articles with linguistic analysis and concept reinforcement

This research addresses the limitations of pretrained models (PTMs) in generating accurate and comprehensive abstractive summaries of scientific articles, focusing on the challenges posed by medical research. The proposed solution, medical text simplification and summarization (MedTSS), introduces a dedicated module that enriches the source text for PTMs. MedTSS addresses token-limit issues, reinforces multiple concepts, and mitigates entity hallucination without additional training. The module also conducts linguistic analysis to simplify the generated summaries, which is tailored to the complexity of medical research articles. MedTSS improves the ROUGE-1 score from 16.46 to 35.17 without additional training. By emphasizing knowledge-driven components, the framework challenges the common narrative of 'more data' or 'more parameters.' This alternative approach, especially applicable in health-related domains, is a broader contribution to the field of NLP, and the authors suggest it has implications for domains beyond medical research summarization.
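For readers unfamiliar with the metric quoted above, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch, assuming lowercase whitespace tokenization with no stemming or stopword handling; the function name `rouge1` is illustrative, and the abstract does not state which ROUGE implementation or variant (precision, recall, or F1) produced the reported scores. Published scores such as 35.17 are conventionally these values multiplied by 100.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 between two texts.

    Assumption: tokens are lowercase whitespace-separated words.
    Standard ROUGE toolkits apply their own tokenization and optional stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat", "the cat sat on the mat")
print(scores)
```

In this toy pair every candidate word appears in the reference, so precision is high while recall is limited by the reference words the candidate leaves out.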

Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Text Simplification | EurekaAlert | MedTSS-BART (Without Training) | ROUGE-1 | 35.17 | #1
