Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer

Scarcity of parallel data causes formality style transfer models to have scarce success in preserving content. We show that fine-tuning pre-trained language (GPT-2) and sequence-to-sequence (BART) models boosts content preservation, and that this is possible even with limited amounts of parallel data. Augmenting these models with rewards that target style and content -- the two core aspects of the task -- we achieve a new state-of-the-art.

Results in Papers With Code
(↓ scroll down to see all results)