Search Results for author: Yinhan Liu

Found 8 papers, 6 papers with code

Cloze-driven Pretraining of Self-attention Networks

no code implementations • EMNLP-IJCNLP 2019 • Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli

We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems.

Ranked #10 on Constituency Parsing on Penn Treebank

Constituency Parsing, NER, +2 more

Mask-Predict: Parallel Decoding of Conditional Masked Language Models

2 code implementations • EMNLP-IJCNLP 2019 • Marjan Ghazvininejad, Omer Levy, Yinhan Liu, Luke Zettlemoyer

Most machine translation systems generate text autoregressively from left to right.

Language Modelling, Machine Translation, +2 more

SpanBERT: Improving Pre-training by Representing and Predicting Spans

6 code implementations • TACL 2020 • Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy

We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.

Ranked #1 on Question Answering on NewsQA (F1 metric)

Linguistic Acceptability, Natural Language Inference, +4 more

RoBERTa: A Robustly Optimized BERT Pretraining Approach

59 code implementations • 26 Jul 2019 • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Ranked #1 on Only Connect Walls Dataset Task 1 (Grouping) on OCW (Wasserstein Distance (WD) metric, using extra training data)

Document Image Classification, Language Modelling, +13 more

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

43 code implementations • ACL 2020 • Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdel-rahman Mohamed, Omer Levy, Ves Stoyanov, Luke Zettlemoyer

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.

Ranked #3 on Open-Domain Question Answering on ELI5

Abstractive Text Summarization, Denoising, +5 more

Multilingual Denoising Pre-training for Neural Machine Translation

5 code implementations • 22 Jan 2020 • Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

Denoising, Sentence, +2 more

Recipes for building an open-domain chatbot

7 code implementations • EACL 2021 • Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston

Building open-domain chatbots is a challenging area for machine learning research.

Chatbot

Hierarchical Learning for Generation with Long Source Sequences

no code implementations • 15 Apr 2021 • Tobias Rohde, Xiaoxia Wu, Yinhan Liu

One of the challenges for current sequence to sequence (seq2seq) models is processing long sequences, such as those in summarization and document level machine translation tasks.

Ranked #1 on Document Level Machine Translation on WMT2019 English-German

Document Level Machine Translation, Document Summarization, +6 more
