no code implementations • 9 Sep 2023 • Xinyi Wang, John Wieting, Jonathan H. Clark
Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning.
no code implementations • 23 May 2023 • Benjamin Muller, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang
Based on these models, we improve the attribution level of a cross-lingual question-answering system.
1 code implementation • 19 May 2023 • Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar
We evaluate commonly used models on the benchmark.
no code implementations • 17 May 2023 • Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu
Through extensive evaluations on English and multilingual language and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.
Ranked #1 on Question Answering on TriviaQA (using extra training data)
1 code implementation • 23 Mar 2023 • Ivan Stelmakh, John Wieting, Graham Neubig, Nihar B. Shah
We address this challenge by collecting a novel dataset of similarity scores that we release to the research community.
1 code implementation • 23 Mar 2023 • Kalpesh Krishna, Yixiao Song, Marzena Karpinska, John Wieting, Mohit Iyyer
To increase the robustness of AI-generated text detection to paraphrase attacks, we introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
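A minimal sketch of this retrieval-based defense, assuming the API provider stores its past generations and has access to a semantic encoder; the `embed` function and `GENERATION_DB` corpus below are illustrative stand-ins, not the paper's implementation:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy semantic encoder: hashed bag of words, L2-normalized.
    A real deployment would use a trained sentence-embedding model."""
    v = np.zeros(dim)
    for w in text.lower().replace(".", "").split():
        v[hash(w) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

# Corpus of generations previously served by the API provider (illustrative).
GENERATION_DB = [
    "The quick brown fox jumps over the lazy dog.",
    "Large language models can produce fluent paraphrases.",
]
DB_VECTORS = np.stack([embed(t) for t in GENERATION_DB])

def is_machine_generated(candidate: str, threshold: float = 0.7) -> bool:
    """Flag a candidate if it is semantically close to any stored generation,
    so paraphrasing the output does not evade detection."""
    sims = DB_VECTORS @ embed(candidate)
    return float(sims.max()) >= threshold

print(is_machine_generated("Fluent paraphrases can be produced by large language models."))
```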
1 code implementation • 21 Dec 2022 • John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, Taylor Berg-Kirkpatrick
Contrastive learning has been successfully used for retrieval of semantically aligned sentences, but it often requires large batch sizes or careful engineering to work well.
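For context, the in-batch contrastive objective that such retrieval models typically rely on can be sketched in a few lines; this is a generic InfoNCE-style loss in NumPy (illustrative, not the paper's model), and it shows why batch size matters: every other item in the batch serves as a negative.

```python
import numpy as np

def info_nce_loss(src: np.ndarray, tgt: np.ndarray, temperature: float = 0.05) -> float:
    """In-batch contrastive loss over L2-normalized embeddings.
    src[i] and tgt[i] are an aligned pair; every other row in the batch
    is treated as a negative, so larger batches give harder negatives."""
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    logits = src @ tgt.T / temperature          # (batch, batch) similarity matrix
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))  # pull the aligned (diagonal) pairs together

rng = np.random.default_rng(0)
src, tgt = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
print(info_nce_loss(src, tgt))
```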
1 code implementation • 25 Oct 2022 • Katherine Thai, Marzena Karpinska, Kalpesh Krishna, Bill Ray, Moira Inghilleri, John Wieting, Mohit Iyyer
Using Par3, we discover that expert literary translators prefer reference human translations over machine-translated paragraphs at a rate of 84%, while state-of-the-art automatic MT metrics do not correlate with those preferences.
no code implementations • 1 Jul 2022 • Wenhu Chen, William W. Cohen, Michiel de Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting
In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking.
1 code implementation • 19 May 2022 • Kalpesh Krishna, Yapei Chang, John Wieting, Mohit Iyyer
Given an input sequence (or prefix), modern language models often assign high probabilities to output sequences that are repetitive, incoherent, or irrelevant to the prefix; as such, model-generated text also contains such artifacts.
no code implementations • 28 Apr 2022 • Yue Dong, John Wieting, Pat Verga
In this work, we show that these entities are not aberrations, but they instead require utilizing external world knowledge to infer reasoning paths from entities in the source.
no code implementations • 10 Apr 2022 • Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen
Retrieval augmented language models have recently become the standard for knowledge intensive tasks.
1 code implementation • EMNLP (MRL) 2021 • Monisha Jegadeesan, Sachin Kumar, John Wieting, Yulia Tsvetkov
We present a novel technique for zero-shot paraphrase generation.
no code implementations • ACL 2022 • Pengcheng Yin, John Wieting, Avirup Sil, Graham Neubig
Semantic parsers map natural language utterances into meaning representations (e.g., programs).
1 code implementation • 30 Apr 2021 • John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick
We train these models on large amounts of data, achieving significantly better performance than reported in the original papers proposing the methods, on a suite of monolingual semantic similarity, cross-lingual semantic similarity, and bitext mining tasks.
4 code implementations • 11 Mar 2021 • Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly-used models still require an explicit tokenization step.
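As a point of reference for what "no explicit tokenization step" can look like, here is a hashed code-point mapping in the spirit of character-level models; it is a toy sketch, not CANINE's actual hashing or downsampling scheme:

```python
def codepoint_ids(text: str, num_buckets: int = 16384) -> list[int]:
    """Map each Unicode code point to an embedding id by hashing into a fixed
    number of buckets, so no tokenizer or finite vocabulary is required.
    (Illustrative only; not the model's actual scheme.)"""
    return [ord(ch) % num_buckets for ch in text]

# The same function handles any script without a language-specific tokenizer.
print(codepoint_ids("Hola, món!"))
print(codepoint_ids("こんにちは"))
```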
1 code implementation • NAACL 2021 • Yixin Liu, Graham Neubig, John Wieting
In most cases, the lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task.
1 code implementation • EMNLP 2020 • Kalpesh Krishna, John Wieting, Mohit Iyyer
Modern NLP defines the task of style transfer as modifying the style of a given sentence without appreciably changing its semantics, which implies that the outputs of style transfer systems should be paraphrases of their inputs.
1 code implementation • TACL 2020 • Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig
Cross-lingual entity linking (XEL) is the task of finding referents in a target-language knowledge base (KB) for mentions extracted from source-language texts.
2 code implementations • EMNLP 2020 • John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick
Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences.
4 code implementations • ACL 2019 • John Wieting, Kevin Gimpel, Graham Neubig, Taylor Berg-Kirkpatrick
We present a model and methodology for learning paraphrastic sentence embeddings directly from bitext, removing the time-consuming intermediate step of creating paraphrase corpora.
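A rough sketch of learning from bitext with a margin-based objective, where each source sentence should be closer to its aligned translation than to the hardest in-batch negative; this is a generic formulation under that assumption, not the paper's exact loss:

```python
import numpy as np

def margin_loss(src: np.ndarray, tgt: np.ndarray, margin: float = 0.4) -> float:
    """Hinge loss over a batch of translation pairs: each source embedding must
    beat its hardest in-batch negative by at least `margin` in cosine similarity."""
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = src @ tgt.T                       # (batch, batch) cosine similarities
    pos = np.diag(sims).copy()               # aligned translation pairs
    np.fill_diagonal(sims, -np.inf)          # exclude positives when picking negatives
    neg = sims.max(axis=1)                   # hardest in-batch negative per source
    return float(np.mean(np.maximum(0.0, margin - pos + neg)))

rng = np.random.default_rng(0)
print(margin_loss(rng.normal(size=(8, 16)), rng.normal(size=(8, 16))))
```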
1 code implementation • 14 Sep 2019 • John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig
While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can substantially improve final translation accuracy.
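The minimum risk training idea referenced here can be written compactly: sample candidate translations, renormalize their model probabilities over the sample set, and minimize the expected negative reward, where the reward could be BLEU or a continuous semantic-similarity score. A generic sketch under those assumptions:

```python
import numpy as np

def minimum_risk_loss(log_probs, rewards, alpha: float = 1.0) -> float:
    """Expected negative reward under the model's sharpened, renormalized
    distribution over sampled translations (generic sketch, not the paper's
    exact formulation)."""
    weights = np.exp(alpha * np.asarray(log_probs, dtype=float))
    weights = weights / weights.sum()            # renormalize over the sample set
    return float(-(weights * np.asarray(rewards, dtype=float)).sum())

# Three sampled translations of one source sentence, scored by any reward metric.
print(minimum_risk_loss(log_probs=[-2.1, -3.0, -5.2], rewards=[0.9, 0.4, 0.1]))
```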
no code implementations • ACL 2019 • John Wieting, Taylor Berg-Kirkpatrick, Kevin Gimpel, Graham Neubig
While most neural machine translation (NMT) systems are still trained using maximum likelihood estimation, recent work has demonstrated that optimizing systems to directly improve evaluation metrics such as BLEU can significantly improve final translation accuracy.
2 code implementations • NAACL 2019 • Graham Neubig, Zi-Yi Dou, Junjie Hu, Paul Michel, Danish Pruthi, Xinyi Wang, John Wieting
In this paper, we describe compare-mt, a tool for holistic analysis and comparison of the results of systems for language generation tasks such as machine translation.
1 code implementation • ICLR 2019 • John Wieting, Douwe Kiela
We explore various methods for computing sentence representations from pre-trained word embeddings without any training, i.e., using nothing but random parameterizations.
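A minimal example of one such untrained encoder: a fixed random projection of pre-trained word vectors followed by max pooling, in the spirit of a bag of random embedding projections. The word vectors below are random placeholders for real pre-trained embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for pre-trained word embeddings (e.g., GloVe vectors).
word_vectors = {w: rng.normal(size=300) for w in "the cat sat on the mat".split()}

# A single random, untrained projection shared across all words.
projection = rng.uniform(-1.0, 1.0, size=(4096, 300))

def random_sentence_encoding(sentence: str) -> np.ndarray:
    """Project each word embedding with the fixed random matrix and max-pool.
    No parameters are trained; only the word embeddings carry information."""
    vecs = [projection @ word_vectors[w] for w in sentence.split() if w in word_vectors]
    return np.max(np.stack(vecs), axis=0)

print(random_sentence_encoding("the cat sat on the mat").shape)  # (4096,)
```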
no code implementations • SEMEVAL 2018 • Manasvi Sagarkar, John Wieting, Lifu Tu, Kevin Gimpel
We study the problem of measuring the quality of automatically-generated stories.
1 code implementation • LREC 2018 • Daniel Khashabi, Mark Sammons, Ben Zhou, Tom Redman, Christos Christodoulopoulos, Vivek Srikumar, Nicholas Rizzolo, Lev Ratinov, Guanheng Luo, Quang Do, Chen-Tse Tsai, Subhro Roy, Stephen Mayhew, Zhili Feng, John Wieting, Xiaodong Yu, Yangqiu Song, Shashank Gupta, Shyam Upadhyay, Naveen Arivazhagan, Qiang Ning, Shaoshi Ling, Dan Roth
2 code implementations • NAACL 2018 • Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer
We propose syntactically controlled paraphrase networks (SCPNs) and use them to generate adversarial examples.
no code implementations • ACL 2018 • John Wieting, Kevin Gimpel
We describe PARANMT-50M, a dataset of more than 50 million English-English sentential paraphrase pairs.
no code implementations • EMNLP 2017 • John Wieting, Jonathan Mallinson, Kevin Gimpel
We consider the problem of learning general-purpose, paraphrastic sentence embeddings in the setting of Wieting et al. (2016b).
no code implementations • ACL 2017 • John Wieting, Kevin Gimpel
We consider the problem of learning general-purpose, paraphrastic sentence embeddings, revisiting the setting of Wieting et al. (2016b).
no code implementations • EMNLP 2016 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
We present Charagram embeddings, a simple approach for learning character-based compositional models to embed textual sequences.
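The core of the approach is representing a sequence by the sum of its character n-gram embeddings; the sketch below uses hashed, randomly initialized n-gram vectors purely for illustration, whereas the paper learns them:

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_BUCKETS, DIM = 50000, 128
ngram_embeddings = rng.normal(size=(NUM_BUCKETS, DIM))  # one vector per hashed n-gram

def charagram(text: str, n_sizes=(2, 3, 4)) -> np.ndarray:
    """Embed a word or sentence as the sum of its character n-gram vectors,
    so rare words and misspellings still receive sensible representations."""
    padded = "#" + text.lower() + "#"
    vec = np.zeros(DIM)
    for n in n_sizes:
        for i in range(len(padded) - n + 1):
            vec += ngram_embeddings[hash(padded[i:i + n]) % NUM_BUCKETS]
    return vec

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(charagram("colour"), charagram("color")))  # many shared n-grams -> higher similarity
print(cosine(charagram("colour"), charagram("piano")))  # few shared n-grams -> lower similarity
```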
no code implementations • 25 Nov 2015 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
We again find that the word averaging models perform well for sentence similarity and entailment, outperforming LSTMs.
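The word-averaging model in question is simple enough to state in a few lines; the vectors below are random stand-ins for pre-trained embeddings, so the printed similarity is only illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy pre-trained word vectors; in practice these would come from trained embeddings.
word_vectors = {w: rng.normal(size=100) for w in
                "a man is playing guitar person the music".split()}

def average_embedding(sentence: str) -> np.ndarray:
    """Sentence vector = mean of its word vectors: the simple averaging model
    found to be surprisingly competitive with LSTMs."""
    vecs = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(average_embedding("a man is playing guitar"),
             average_embedding("a person is playing music")))
```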
no code implementations • 25 Aug 2015 • Daniel Khashabi, John Wieting, Jeffrey Yufei Liu, Feng Liang
Empirical studies compare our work with many constrained clustering algorithms from the literature, across a variety of data sets and under a variety of conditions, such as noisy side information and erroneous k values.
1 code implementation • TACL 2015 • John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu, Dan Roth
The Paraphrase Database (PPDB; Ganitkevitch et al., 2013) is an extensive semantic resource, consisting of a list of phrase pairs with (heuristic) confidence estimates.
no code implementations • 2 Dec 2014 • John Wieting
The second is a supervised approach where a classifier is learned to predict entailment given a concatenated latent vector representation of the word.
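A minimal sketch of the supervised variant described here, using toy vectors and a tiny labeled set; `LogisticRegression` stands in for whatever classifier the work actually used, and the word vectors are random placeholders:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim = 50

# Toy latent word vectors; the approach would use learned representations.
vectors = {w: rng.normal(size=dim) for w in ["dog", "animal", "car", "vehicle", "cat", "fruit"]}

# Word pairs labeled 1 if the first word entails the second (tiny illustrative set).
pairs = [("dog", "animal", 1), ("car", "vehicle", 1), ("cat", "fruit", 0), ("dog", "vehicle", 0)]

# Feature for a pair = concatenation of the two word vectors, as described above.
X = np.stack([np.concatenate([vectors[a], vectors[b]]) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = LogisticRegression().fit(X, y)   # supervised entailment classifier
print(clf.predict(X))                  # sanity check on the (tiny) training set
```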