One of the challenges for current sequence-to-sequence (seq2seq) models is processing long sequences, such as those found in summarization and document-level machine translation tasks.
Ranked #1 on Text Summarization on Pubmed
Building open-domain chatbots is a challenging problem for machine learning research.
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.
We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token (a rough sketch of both operations follows below).
Ranked #3 on Text Summarization on X-Sum
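A minimal sketch of the two noising operations named above (sentence shuffling and text in-filling), assuming plain Python and NumPy; the function names, masking ratio, and Poisson span-length sampling are illustrative assumptions, not the paper's reference implementation:

```python
import random
import numpy as np

MASK = "<mask>"

def shuffle_sentences(sentences):
    """Randomly permute the order of a document's sentences."""
    order = list(sentences)
    random.shuffle(order)
    return order

def infill_spans(tokens, mask_ratio=0.3, poisson_lambda=3.0):
    """Replace sampled spans of tokens with a single <mask> token each.

    Because a whole span collapses to one mask token, the model must also
    predict how many tokens the masked span originally contained.
    """
    budget = int(round(mask_ratio * len(tokens)))  # total tokens to remove
    noisy, i = [], 0
    while i < len(tokens):
        if budget > 0 and random.random() < mask_ratio:
            span = int(np.random.poisson(poisson_lambda))
            span = max(1, min(span, budget, len(tokens) - i))
            noisy.append(MASK)        # the whole span becomes a single token
            i += span
            budget -= span
        else:
            noisy.append(tokens[i])
            i += 1
    return noisy

# Example: corrupt a toy "document" with both noising operations.
doc = ["the cat sat on the mat .", "it was a warm day .", "the dog barked ."]
noisy = [" ".join(infill_spans(s.split())) for s in shuffle_sentences(doc)]
```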
Language model pretraining has led to significant performance gains, but careful comparison between different approaches is challenging.
Ranked #2 on Natural Language Inference on ANLI test (using extra training data)
We present SpanBERT, a pre-training method designed to better represent and predict spans of text (a sketch of span masking follows below).
Ranked #1 on Question Answering on TriviaQA (F1 metric)
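A rough sketch of contiguous span masking, the idea behind SpanBERT's pre-training; this is plain Python for illustration only, the function name and uniform span-length sampling are assumptions, and the paper's span boundary objective is not shown:

```python
import random

MASK = "[MASK]"

def mask_contiguous_spans(tokens, mask_ratio=0.15, max_span_len=10):
    """Mask whole contiguous spans rather than isolated tokens.

    Every token inside a selected span becomes [MASK], so the sequence
    length is preserved (contrast with the in-filling sketch above, where
    a span collapses to a single mask token).
    """
    tokens = list(tokens)
    budget = int(round(mask_ratio * len(tokens)))  # tokens left to mask
    i = 0
    while i < len(tokens) and budget > 0:
        if random.random() < mask_ratio:
            span = min(random.randint(1, max_span_len), budget, len(tokens) - i)
            for j in range(i, i + span):
                tokens[j] = MASK
            budget -= span
            i += span
        else:
            i += 1
    return tokens
```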
We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems.
Ranked #8 on Constituency Parsing on Penn Treebank