Unsupervised Text Summarization via Mixed Model Back-Translation

22 Aug 2019 · Yacine Jernite ·

Back-translation based approaches have recently lead to significant progress in unsupervised sequence-to-sequence tasks such as machine translation or style transfer. In this work, we extend the paradigm to the problem of learning a sentence summarization system from unaligned data. We present several initial models which rely on the asymmetrical nature of the task to perform the first back-translation step, and demonstrate the value of combining the data created by these diverse initialization methods. Our system outperforms the current state-of-the-art for unsupervised sentence summarization from fully unaligned data by over 2 ROUGE, and matches the performance of recent semi-supervised approaches.

PDF Abstract