18 papers with code • 1 benchmark • 3 datasets
Sentence Fusion is the task of combining several independent sentences into a single coherent text. Sentence Fusion is important in many NLP applications, including retrieval-based dialogue, text summarization and question answering.
There is thus a crucial gap between sentence selection and fusion to support summarizing by both compressing single sentences and fusing pairs.
We author a set of rules for identifying a diverse set of discourse phenomena in raw text, and decomposing the text into two independent sentences.
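As a rough illustration of what one such rule could look like, the sketch below splits a sentence at a comma followed by a coordinating connective and returns the two resulting sentences plus the connective that was removed. The rule, the connective list, and the function name `split_on_connective` are assumptions made for this example; they are not the rules from the paper.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule: "<left>, <connective> <right>." becomes "<left>." and "<Right>."
# The connective list is illustrative only.
_CONNECTIVE_RULE = re.compile(
    r"^(?P<left>.+?), (?P<conn>but|so|and) (?P<right>.+?)\.$"
)


def split_on_connective(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Split one sentence into two independent sentences at a connective.

    Returns (first_sentence, second_sentence, connective), or None when the
    rule does not apply.
    """
    match = _CONNECTIVE_RULE.match(sentence.strip())
    if match is None:
        return None
    left = match.group("left").strip() + "."
    right = match.group("right").strip()
    # Capitalize the first character so the second part reads as a sentence.
    right = right[0].upper() + right[1:] + "."
    return left, right, match.group("conn")


example = split_on_connective("The team trained hard, but it lost the final.")
print(example)
```

The removed connective is returned alongside the two sentences because a fusion model trained on such pairs would need it as part of the target text.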
We achieve this by decomposing the text-editing task into two sub-tasks: tagging, which decides on the subset of input tokens and their order in the output text, and insertion, which in-fills the output tokens that are not present in the input.
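The two sub-tasks can be sketched in a few lines. In this minimal example the model predictions are hard-coded: the tagging step receives which tokens to keep, their output order, and how many placeholder slots to open after each token, and the insertion step fills those slots. The names `apply_tags`, `fill_masks`, and `MASK`, and the representation of the tags, are assumptions for illustration, not the paper's implementation.

```python
from typing import Dict, List, Sequence

MASK = "[MASK]"  # placeholder for a token that the insertion step must supply


def apply_tags(
    tokens: Sequence[str],
    order: Sequence[int],
    masks_after: Dict[int, int],
) -> List[str]:
    """Tagging step: keep the tokens listed in `order`, in that order.

    Tokens whose index is absent from `order` are deleted. `masks_after`
    maps a kept token index to the number of MASK slots opened after it.
    """
    template: List[str] = []
    for idx in order:
        template.append(tokens[idx])
        template.extend([MASK] * masks_after.get(idx, 0))
    return template


def fill_masks(template: Sequence[str], fillers: Sequence[str]) -> List[str]:
    """Insertion step: replace each MASK slot with the next predicted token."""
    if list(template).count(MASK) != len(fillers):
        raise ValueError("number of fillers must match number of MASK slots")
    remaining = iter(fillers)
    return [next(remaining) if tok == MASK else tok for tok in template]


source = "Turing was born in 1912 . Turing died in 1954 .".split()
# Hard-coded stand-ins for model output: drop the first period and the
# repeated subject (indices 5 and 6), then open one slot after "1912".
template = apply_tags(source, order=[0, 1, 2, 3, 4, 7, 8, 9, 10], masks_after={4: 1})
fused = fill_masks(template, fillers=["and"])
print(" ".join(fused))
```

Because most output tokens are copied from the input, the insertion step only has to predict the few tokens that are genuinely new.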
We create a dataset containing the documents, source and fusion sentences, and human annotations of points of correspondence between sentences.
For text normalization, sentence fusion, and grammatical error correction, our approach improves explainability by associating each edit operation with a human-readable tag.
The ability to fuse sentences is highly attractive for summarization systems because it is an essential step to produce succinct abstracts.
Our approach maximizes the completeness and semantic accuracy of the output text while leveraging recent pre-trained models for text editing (LaserTagger) and language modeling (GPT-2) to improve text fluency.