Search Results for author: John Wieting

Found 37 papers, 20 papers with code

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

no code implementations 9 Sep 2023 Xinyi Wang, John Wieting, Jonathan H. Clark

Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning.

PaLM 2 Technical Report

no code implementations 17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

 Ranked #1 on Question Answering on TriviaQA (using extra training data)

Language Modelling Question Answering

A Gold Standard Dataset for the Reviewer Assignment Problem

1 code implementation 23 Mar 2023 Ivan Stelmakh, John Wieting, Graham Neubig, Nihar B. Shah

We address this challenge by collecting a novel dataset of similarity scores that we release to the research community.

Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense

1 code implementation 23 Mar 2023 Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, Mohit Iyyer

To increase the robustness of AI-generated text detection to paraphrase attacks, we introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.

Language Modelling Paraphrase Generation +2
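For intuition, here is a minimal sketch of the retrieval idea described above, not the authors' implementation: the API provider keeps a store of everything it has generated and flags a candidate text whose nearest stored generation is highly similar. TF-IDF cosine similarity stands in for the semantic retriever used in the paper, and the `RetrievalDetector` class and its threshold are illustrative assumptions.

```python
# Minimal sketch of retrieval-as-defense against paraphrase attacks (not the
# paper's code). Assumes the API provider stores every generation it has served;
# TF-IDF stands in for a semantic encoder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class RetrievalDetector:
    def __init__(self, served_generations, threshold=0.75):
        self.vectorizer = TfidfVectorizer().fit(served_generations)
        self.index = self.vectorizer.transform(served_generations)
        self.threshold = threshold  # similarity above this flags the text

    def is_machine_generated(self, candidate):
        sims = cosine_similarity(self.vectorizer.transform([candidate]), self.index)
        return float(sims.max()) >= self.threshold

detector = RetrievalDetector(["the cat sat quietly on the warm mat"])
print(detector.is_machine_generated("the cat was sitting on the warm mat"))  # a light paraphrase is still caught
```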

Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval

1 code implementation 21 Dec 2022 John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, Taylor Berg-Kirkpatrick

Contrastive learning has been successfully used for retrieval of semantically aligned sentences, but it often requires large batch sizes or careful engineering to work well.

Contrastive Learning Open-Domain Question Answering +3
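As background for the abstract's point about batch sizes, the sketch below shows a standard in-batch contrastive (InfoNCE) objective for aligned sentence pairs, not the paper's variational generative model: every other sentence in the batch serves as a negative, which is why large batches matter. The random embeddings and temperature value are placeholders.

```python
# Sketch of in-batch contrastive learning over aligned sentence pairs.
# Each row of src is paired with the same row of tgt; off-diagonal rows act as negatives.
import numpy as np

rng = np.random.default_rng(0)
batch, dim, temperature = 8, 64, 0.05
src = rng.normal(size=(batch, dim))   # embeddings of source sentences (random placeholders)
tgt = rng.normal(size=(batch, dim))   # embeddings of their aligned translations

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

logits = normalize(src) @ normalize(tgt).T / temperature            # (batch, batch) similarities
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))                                  # diagonal entries are the true pairs
print(loss)
```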

Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature

1 code implementation 25 Oct 2022 Katherine Thai, Marzena Karpinska, Kalpesh Krishna, Bill Ray, Moira Inghilleri, John Wieting, Mohit Iyyer

Using Par3, we discover that expert literary translators prefer reference human translations over machine-translated paragraphs at a rate of 84%, while state-of-the-art automatic MT metrics do not correlate with those preferences.

Machine Translation Translation

QA Is the New KR: Question-Answer Pairs as Knowledge Bases

no code implementations 1 Jul 2022 Wenhu Chen, William W. Cohen, Michiel de Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting

In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking.

Entity Linking Question Generation +1

RankGen: Improving Text Generation with Large Ranking Models

1 code implementation 19 May 2022 Kalpesh Krishna, Yapei Chang, John Wieting, Mohit Iyyer

Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts.

Contrastive Learning Language Modelling +2

Faithful to the Document or to the World? Mitigating Hallucinations via Entity-linked Knowledge in Abstractive Summarization

no code implementations 28 Apr 2022 Yue Dong, John Wieting, Pat Verga

In this work, we show that these entities are not aberrations, but instead require utilizing external world knowledge to infer reasoning paths from entities in the source.

Abstractive Text Summarization

Paraphrastic Representations at Scale

1 code implementation 30 Apr 2021 John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick

We train these models on large amounts of data, achieving significantly better performance than reported in the original papers proposing the methods, on a suite of monolingual semantic similarity, cross-lingual semantic similarity, and bitext mining tasks.

Semantic Similarity Semantic Textual Similarity

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

4 code implementations 11 Mar 2021 Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting

Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly-used models still require an explicit tokenization step.

Inductive Bias
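To illustrate what "tokenization-free" means in practice, the toy snippet below maps raw text straight to Unicode code points, the kind of input a character-level encoder such as CANINE consumes; it shows only the input step, not the model architecture.

```python
# Toy illustration of a tokenization-free input pipeline: the model consumes
# Unicode code points directly, so no learned vocabulary or tokenizer is required.
text = "No tokenizer needed: 漢字, emoji 😊, diacritics é"
input_ids = [ord(ch) for ch in text]
print(input_ids[:10])
```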

On Learning Text Style Transfer with Direct Rewards

1 code implementation NAACL 2021 Yixin Liu, Graham Neubig, John Wieting

In most cases, the lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.

Machine Translation Semantic Similarity +4

Reformulating Unsupervised Style Transfer as Paraphrase Generation

1 code implementation EMNLP 2020 Kalpesh Krishna, John Wieting, Mohit Iyyer

Modern NLP defines the task of style transfer as modifying the style of a given sentence without appreciably changing its semantics, which implies that the outputs of style transfer systems should be paraphrases of their inputs.

Paraphrase Generation Style Transfer

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking

1 code implementation TACL 2020 Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig

Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts.

Cross-Lingual Entity Linking Entity Linking +1

A Bilingual Generative Transformer for Semantic Sentence Embedding

2 code implementations EMNLP 2020 John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences.

Semantic Similarity Semantic Textual Similarity +2

Simple and Effective Paraphrastic Similarity from Parallel Translations

4 code implementations ACL 2019 John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick

We present a model and methodology for learning paraphrastic sentence embeddings directly from bitext, removing the time-consuming intermediate step of creating paraphrase corpora.

Sentence Embeddings
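A hedged sketch of the kind of margin-based objective typically used when training sentence embeddings on bitext (the exact loss and encoder here are assumptions, not the paper's code): a source sentence should score higher with its translation than with a random in-batch negative.

```python
# Illustrative margin loss over parallel sentence pairs: pull translations together,
# push random in-batch negatives away. Random vectors stand in for the encoder.
import numpy as np

rng = np.random.default_rng(0)
batch, dim, margin = 4, 50, 0.4
src = rng.normal(size=(batch, dim))   # encoded source sentences (placeholders)
trg = rng.normal(size=(batch, dim))   # encoded translations (positives)

def cos(a, b):
    return np.sum(a * b, axis=-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

neg = np.roll(trg, 1, axis=0)                                   # shift by one: cheap in-batch negatives
loss = np.maximum(0.0, margin - cos(src, trg) + cos(src, neg)).mean()
print(loss)
```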

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity

1 code implementation 14 Sep 2019 John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig

While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can substantially improve final translation accuracy.

Machine Translation NMT +3

Beyond BLEU: Training Neural Machine Translation with Semantic Similarity

no code implementations ACL 2019 John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig

While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can significantly improve final translation accuracy.

Machine Translation NMT +3

compare-mt: A Tool for Holistic Comparison of Language Generation Systems

2 code implementations NAACL 2019 Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, Xinyi Wang, John Wieting

In this paper, we describe compare-mt, a tool for holistic analysis and comparison of the results of systems for language generation tasks such as machine translation.

Machine Translation Text Generation +1

No Training Required: Exploring Random Encoders for Sentence Classification

1 code implementation ICLR 2019 John Wieting, Douwe Kiela

We explore various methods for computing sentence representations from pre-trained word embeddings without any training, i.e., using nothing but random parameterizations.

Classification General Classification +3
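A minimal sketch of one such random parameterization, assuming a bag-of-random-embedding-projections setup: pre-trained word vectors are mean-pooled and passed through a fixed, untrained random projection. The tiny embedding table below is a placeholder for real pre-trained vectors such as GloVe.

```python
# Random encoder sketch: nothing here is ever trained.
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim = 300, 1024
projection = rng.uniform(-1 / np.sqrt(in_dim), 1 / np.sqrt(in_dim), size=(in_dim, out_dim))
pretrained = {w: rng.normal(size=in_dim) for w in ["random", "encoders", "need", "no", "training"]}

def encode(sentence):
    vecs = [pretrained[w] for w in sentence.lower().split() if w in pretrained]
    pooled = np.mean(vecs, axis=0)              # mean-pool the word vectors
    return np.maximum(pooled @ projection, 0)   # fixed random projection + ReLU

print(encode("random encoders need no training").shape)  # (1024,)
```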

Adversarial Example Generation with Syntactically Controlled Paraphrase Networks

2 code implementations NAACL 2018 Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer

We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples.

Learning Paraphrastic Sentence Embeddings from Back-Translated Bitext

no code implementations EMNLP 2017 John Wieting, Jonathan Mallinson, Kevin Gimpel

We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b).

Machine Translation Sentence Embeddings +1

Revisiting Recurrent Networks for Paraphrastic Sentence Embeddings

no code implementations ACL 2017 John Wieting, Kevin Gimpel

We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b).

Sentence Embeddings Transfer Learning

Charagram: Embedding Words and Sentences via Character n-grams

no code implementations EMNLP 2016 John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu

We present Charagram embeddings, a simple approach for learning character-based compositional models to embed textual sequences.

Part-Of-Speech Tagging Sentence Similarity +1
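A rough sketch of the character n-gram idea: a string is represented as the sum of vectors for its character n-grams. Hashing n-grams into a fixed random table is a simplification of the paper's learned n-gram embeddings, and the n-gram orders and table size are illustrative choices.

```python
# Charagram-style sketch: sum of hashed character n-gram vectors (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
table_size, dim, ns = 10000, 100, (2, 3, 4)
table = rng.normal(size=(table_size, dim))

def char_ngrams(text, n):
    padded = f"#{text}#"                         # mark word boundaries
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def charagram(text):
    vec = np.zeros(dim)
    for n in ns:
        for gram in char_ngrams(text, n):
            vec += table[hash(gram) % table_size]
    return vec

print(charagram("paraphrase").shape)  # (100,)
```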

Towards Universal Paraphrastic Sentence Embeddings

no code implementations 25 Nov 2015 John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu

We again find that the word averaging models perform well for sentence similarity and entailment, outperforming LSTMs.

General Classification Sentence Embeddings +3
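For concreteness, the word-averaging baseline the abstract refers to can be sketched as follows, with a toy random vector table standing in for embeddings trained on paraphrase data: each sentence is the mean of its word vectors, and similarity is their cosine.

```python
# Word-averaging sentence similarity sketch; the vector table is a placeholder.
import numpy as np

rng = np.random.default_rng(0)
vectors = {w: rng.normal(size=25) for w in
           "a man is playing guitar the person plays an instrument".split()}

def embed(sentence):
    return np.mean([vectors[w] for w in sentence.split() if w in vectors], axis=0)

def similarity(s1, s2):
    a, b = embed(s1), embed(s2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(similarity("a man is playing guitar", "the person plays an instrument"))
```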

Clustering With Side Information: From a Probabilistic Model to a Deterministic Algorithm

no code implementations 25 Aug 2015 Daniel Khashabi, John Wieting, Jeffrey Yufei Liu, Feng Liang

Empirical studies have been carried out to compare our work with many constrained clustering algorithms from the literature, on a variety of data sets and under a variety of conditions, such as noisy side information and erroneous k values.

Constrained Clustering

From Paraphrase Database to Compositional Paraphrase Model and Back

1 code implementation TACL 2015 John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu, Dan Roth

The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates.

Word Embeddings

Tiered Clustering to Improve Lexical Entailment

no code implementations 2 Dec 2014 John Wieting

The second is a supervised approach where a classifier is learned to predict entailment given a concatenated latent vector representation of the word.

Clustering Lexical Entailment
