Search Results for author: Richard Yuanzhe Pang

Found 22 papers, 9 papers with code

Self-Rewarding Language Models

2 code implementations 18 Jan 2024 Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal.

Instruction Following Language Modelling

GPQA: A Graduate-Level Google-Proof Q&A Benchmark

1 code implementation 20 Nov 2023 David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, Samuel R. Bowman

We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry.

Multiple-choice

Leveraging Implicit Feedback from Deployment Data in Dialogue

no code implementations 26 Jul 2023 Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He, Jason Weston

We study improving social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations.

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

1 code implementation NeurIPS 2023 Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He

Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity.

Extrapolative Controlled Sequence Generation via Iterative Refinement

1 code implementation 8 Mar 2023 Vishakh Padmakumar, Richard Yuanzhe Pang, He He, Ankur P. Parikh

We study the problem of extrapolative controlled generation, i.e., generating sequences with attribute values beyond the range seen in training.

Attribute Drug Discovery +1

Reward Gaming in Conditional Text Generation

no code implementations 16 Nov 2022 Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur P. Parikh, He He

To align conditional text generation model outputs with desired behaviors, there has been an increasing focus on training the model using reinforcement learning (RL) with reward functions learned from human annotations.

Conditional Text Generation Reinforcement Learning (RL)

SQuALITY: Building a Long-Document Summarization Dataset the Hard Way

1 code implementation 23 May 2022 Alex Wang, Richard Yuanzhe Pang, Angelica Chen, Jason Phang, Samuel R. Bowman

Summarization datasets are often assembled either by scraping naturally occurring public-domain summaries -- which are nearly always in difficult-to-work-with technical domains -- or by using approximate heuristics to extract them from everyday text -- which frequently yields unfaithful summaries.

Document Summarization Multiple-choice

QuALITY: Question Answering with Long Input Texts, Yes!

2 code implementations NAACL 2022 Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel R. Bowman

To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process.

Multiple-choice Multiple Choice Question Answering (MCQA)

Amortized Noisy Channel Neural Machine Translation

no code implementations 16 Dec 2021 Richard Yuanzhe Pang, He He, Kyunghyun Cho

For all three approaches, the generated translations fail to achieve rewards comparable to beam search and rerank (BSR), but the translation quality approximated by BLEU and BLEURT is similar to the quality of BSR-produced translations.

Imitation Learning Knowledge Distillation +4

AgreeSum: Agreement-Oriented Multi-Document Summarization

no code implementations Findings (ACL) 2021 Richard Yuanzhe Pang, Adam D. Lelkes, Vinh Q. Tran, Cong Yu

Given the lack of existing datasets, we create a dataset for AgreeSum, and provide annotations on article-summary entailment relations for a subset of the clusters in the dataset.

Abstractive Text Summarization Document Summarization +1

Comparing Test Sets with Item Response Theory

no code implementations ACL 2021 Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman

Recent years have seen numerous NLP datasets introduced to evaluate the performance of fine-tuned models on natural language understanding tasks.

Natural Language Understanding

Text Generation by Learning from Demonstrations

1 code implementation ICLR 2021 Richard Yuanzhe Pang, He He

Current approaches to text generation largely rely on autoregressive models and maximum likelihood estimation.

Machine Translation Question Generation +4

ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation

1 code implementation ACL 2020 Lifu Tu, Richard Yuanzhe Pang, Sam Wiseman, Kevin Gimpel

We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model.

Machine Translation Translation

Consistency of a Recurrent Language Model With Respect to Incomplete Decoding

1 code implementation EMNLP 2020 Sean Welleck, Ilia Kulikov, Jaedeok Kim, Richard Yuanzhe Pang, Kyunghyun Cho

Despite strong performance on a variety of tasks, neural sequence models trained with maximum likelihood have been shown to exhibit issues such as length bias and degenerate repetition.

Language Modelling

Towards Actual (Not Operational) Textual Style Transfer Auto-Evaluation

no code implementations WS 2019 Richard Yuanzhe Pang

Regarding the problem of automatically generating paraphrases with modified styles or attributes, the difficulty lies in the lack of parallel corpora.

Semantic Similarity Semantic Textual Similarity +1

The Daunting Task of Real-World Textual Style Transfer Auto-Evaluation

no code implementations 9 Oct 2019 Richard Yuanzhe Pang

The difficulty of textual style transfer lies in the lack of parallel corpora.

Style Transfer

Unsupervised Evaluation Metrics and Learning Criteria for Non-Parallel Textual Transfer

no code implementations WS 2019 Richard Yuanzhe Pang, Kevin Gimpel

We show that the metric of post-transfer classification accuracy is insufficient on its own, and propose additional metrics based on semantic preservation and fluency as well as a way to combine them into a single overall score.

Sentence
