Search Results for author: Yinhan Liu

Found 8 papers, 6 papers with code

Hierarchical Learning for Generation with Long Source Sequences

no code implementations15 Apr 2021 Tobias Rohde, Xiaoxia Wu, Yinhan Liu

One of the challenges for current sequence to sequence (seq2seq) models is processing long sequences, such as those in summarization and document level machine translation tasks.

Classification Document-level +6

Multilingual Denoising Pre-training for Neural Machine Translation

3 code implementations22 Jan 2020 Yinhan Liu, Jiatao Gu, Naman Goyal, Xi-An Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Denoising Document-level +1

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

23 code implementations ACL 2020 Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Abstractive Text Summarization Denoising +4

RoBERTa: A Robustly Optimized BERT Pretraining Approach

40 code implementations26 Jul 2019 Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Ranked #2 on Natural Language Inference on ANLI test (using extra training data)

Common Sense Reasoning Language Modelling +6

Cloze-driven Pretraining of Self-attention Networks

no code implementations IJCNLP 2019 Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli

We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems.

Constituency Parsing Named Entity Recognition +3

Cannot find the paper you are looking for? You can Submit a new open access paper.