no code implementations • 19 Jul 2023 • Hao Peng, Qingqing Cao, Jesse Dodge, Matthew E. Peters, Jared Fernandez, Tom Sherborne, Kyle Lo, Sam Skjonsberg, Emma Strubell, Darrell Plessas, Iz Beltagy, Evan Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi
In response, we introduce Pentathlon, a benchmark for holistic and realistic evaluation of model efficiency.
1 code implementation • 24 May 2023 • Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan
The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks.
no code implementations • 15 May 2023 • Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan
Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs.
no code implementations • 14 Feb 2023 • Alexandra Chronopoulou, Matthew E. Peters, Alexander Fraser, Jesse Dodge
We also explore weight averaging of adapters trained on the same domain with different hyper-parameters, and show that it preserves the performance of a PLM on new domains while obtaining strong in-domain results.
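A minimal sketch of the adapter weight-averaging idea mentioned above, assuming the adapters share identical parameter names and shapes; the key names and toy shapes below are illustrative, not the authors' released code.

```python
# Uniformly average adapters trained on the same domain with different hyper-parameters.
import torch


def average_adapters(state_dicts):
    """Element-wise average of a list of adapter state dicts with identical keys."""
    averaged = {}
    for key in state_dicts[0]:
        averaged[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return averaged


# Toy usage: three "adapters" with the same bottleneck parameter shapes.
adapters = [
    {"down.weight": torch.randn(16, 768), "up.weight": torch.randn(768, 16)}
    for _ in range(3)
]
soup = average_adapters(adapters)
print({k: v.shape for k, v in soup.items()})
```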
no code implementations • 24 Oct 2022 • Alexis Ross, Matthew E. Peters, Ana Marasović
Specifically, we evaluate how training self-rationalization models with free-text rationales affects robustness to spurious correlations in fine-tuned encoder-decoder and decoder-only models of six different sizes.
1 code implementation • 24 May 2022 • Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
Our method, called ATTEMPT (ATTEntional Mixtures of Prompt Tuning), obtains source prompts as encodings of large-scale source tasks into a small number of parameters and trains an attention module to interpolate the source prompts and a newly initialized target prompt for every instance in the target task.
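A hedged sketch of the attentional mixture described above: a small attention module scores each source prompt (plus the target prompt) against an instance representation and returns an instance-specific interpolation. Module names, dimensions, and the mean-pooled prompt summaries are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

prompt_len, dim, num_source = 10, 768, 6

source_prompts = torch.randn(num_source, prompt_len, dim)          # frozen prompts from source tasks
target_prompt = torch.randn(prompt_len, dim, requires_grad=True)   # newly initialized target prompt
attn_proj = torch.nn.Linear(dim, dim)                              # small trainable attention module


def mix_prompts(instance_repr):
    """Interpolate source prompts and the target prompt for one instance."""
    candidates = torch.cat([source_prompts, target_prompt.unsqueeze(0)], dim=0)
    keys = candidates.mean(dim=1)                  # (num_source + 1, dim) prompt summaries
    scores = keys @ attn_proj(instance_repr)       # one score per candidate prompt
    weights = F.softmax(scores, dim=0)
    # Weighted combination yields a single instance-specific prompt.
    return (weights[:, None, None] * candidates).sum(dim=0)   # (prompt_len, dim)


mixed = mix_prompts(torch.randn(dim))
print(mixed.shape)  # torch.Size([10, 768])
```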
1 code implementation • Findings (ACL) 2022 • Nishant Subramani, Nivedita Suresh, Matthew E. Peters
Experiments show that there exist steering vectors, which, when added to the hidden states of the language model, generate a target sentence nearly perfectly (> 99 BLEU) for English sentences from a variety of domains.
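A self-contained toy illustration of the steering-vector idea: a single vector z is added to a frozen model's hidden states and optimized so the model reproduces a target sequence. The tiny embedding-plus-head "model" below is a stand-in for a pretrained language model, not the paper's setup.

```python
import torch

torch.manual_seed(0)
vocab, dim, length = 50, 32, 6
embed = torch.nn.Embedding(vocab, dim)     # stands in for a frozen LM's hidden states
lm_head = torch.nn.Linear(dim, vocab)
target = torch.randint(0, vocab, (length,))

z = torch.zeros(dim, requires_grad=True)   # the steering vector; only z is trained
opt = torch.optim.Adam([z], lr=0.1)

for _ in range(200):
    hidden = embed(target) + z             # inject the steering vector into hidden states
    loss = torch.nn.functional.cross_entropy(lm_head(hidden), target)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.3f}")
```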
1 code implementation • 15 Mar 2022 • Hamish Ivison, Matthew E. Peters
We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder.
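A hedged sketch of an input-conditioned hypernetwork under the setup described above: a small network maps the pooled encoder output to the weights of a bottleneck adapter applied to decoder states. Shapes and module names are illustrative assumptions, not the paper's code.

```python
import torch

d_model, bottleneck, hyper_dim = 512, 32, 128

hypernet = torch.nn.Sequential(
    torch.nn.Linear(d_model, hyper_dim),
    torch.nn.ReLU(),
    torch.nn.Linear(hyper_dim, 2 * d_model * bottleneck),   # down- and up-projection weights
)


def adapter_from_encoder(encoder_output, decoder_hidden):
    """Generate adapter weights from the encoder output and apply them to decoder states."""
    pooled = encoder_output.mean(dim=1)                      # (batch, d_model)
    flat = hypernet(pooled)                                  # (batch, 2 * d_model * bottleneck)
    down, up = flat.split(d_model * bottleneck, dim=-1)
    down = down.view(-1, d_model, bottleneck)
    up = up.view(-1, bottleneck, d_model)
    # Bottleneck adapter with a residual connection; parameters are generated per input.
    return decoder_hidden + torch.relu(decoder_hidden @ down) @ up


out = adapter_from_encoder(torch.randn(2, 20, d_model), torch.randn(2, 7, d_model))
print(out.shape)  # torch.Size([2, 7, 512])
```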
1 code implementation • NAACL 2022 • Alexandra Chronopoulou, Matthew E. Peters, Jesse Dodge
The remarkable success of large language models has been driven by dense models trained on massive unlabeled, unstructured corpora.
1 code implementation • Findings (NAACL) 2022 • Ana Marasović, Iz Beltagy, Doug Downey, Matthew E. Peters
We identify the right prompting approach by extensively exploring natural language prompts on FEB. Then, by using this prompt and scaling the model size, we demonstrate that making progress on few-shot self-rationalization is possible.
1 code implementation • ACL 2022 • Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, Matt Gardner
We craft a set of operations to modify the control codes, which in turn steer generation towards targeted attributes.
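A purely illustrative sketch of the control-code idea: a structured control prefix accompanies the generation prompt, and small edit operations on that prefix steer the output. The code format and the voice-swap operation below are hypothetical, not the paper's actual control codes.

```python
def swap_voice(control_code: str) -> str:
    """Flip an 'active' voice tag to 'passive' (and vice versa) in a control prefix."""
    if "VERB+active" in control_code:
        return control_code.replace("VERB+active", "VERB+passive")
    return control_code.replace("VERB+passive", "VERB+active")


original = "[VERB+active | AGENT: the dog | PATIENT: the ball] The dog chased the ball."
perturbed_prefix = swap_voice(original.split("]")[0] + "]")
print(perturbed_prefix)  # feed to the generator to request a passive-voice paraphrase
```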
1 code implementation • NAACL 2021 • Iz Beltagy, Arman Cohan, Hannaneh Hajishirzi, Sewon Min, Matthew E. Peters
In this tutorial, we aim at bringing interested NLP researchers up to speed about the recent and ongoing techniques for document-level representation learning.
no code implementations • EMNLP 2021 • Matt Gardner, William Merrill, Jesse Dodge, Matthew E. Peters, Alexis Ross, Sameer Singh, Noah A. Smith
In this work we argue that for complex language understanding tasks, all simple feature correlations are spurious, and we formalize this notion into a class of problems which we call competency problems.
2 code implementations • Findings (EMNLP) 2021 • Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan
We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective.
Ranked #1 on Citation Recommendation on AAN test
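A hedged sketch of the multi-document masked-LM input construction described in the entry above: related documents are concatenated with separator tokens, a fraction of tokens is masked, and the masked positions receive global attention (Longformer-style). The token names and the 15% masking rate are illustrative assumptions.

```python
import random

random.seed(0)
documents = [
    ["storms", "hit", "the", "coast"],
    ["the", "coast", "was", "hit", "by", "storms"],
]

tokens = []
for doc in documents:                       # concatenate related documents with separators
    tokens += ["<doc-s>"] + doc + ["</doc-s>"]

masked, global_attention = [], []
for tok in tokens:
    if tok not in ("<doc-s>", "</doc-s>") and random.random() < 0.15:
        masked.append("[MASK]")
        global_attention.append(1)          # masked tokens attend globally, across documents
    else:
        masked.append(tok)
        global_attention.append(0)

print(masked)
print(global_attention)
```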
1 code implementation • Findings (ACL) 2021 • Alexis Ross, Ana Marasović, Matthew E. Peters
Humans have been shown to give contrastive explanations, which explain why an observed event happened rather than some other counterfactual event (the contrast case).
1 code implementation • EMNLP 2020 • Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters
Typically, machine learning systems solve new tasks by training on thousands of examples.
16 code implementations • 10 Apr 2020 • Iz Beltagy, Matthew E. Peters, Arman Cohan
To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.
Ranked #2 on Question Answering on WikiHop
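A from-scratch sketch of the sliding-window (local) attention pattern behind the Longformer entry above; the window size and shapes are illustrative, and global-attention tokens are omitted. For clarity this toy masks a full score matrix, whereas a real implementation materializes only the banded scores to achieve the linear scaling described in the abstract.

```python
import torch


def sliding_window_attention(q, k, v, window: int = 2):
    """Each position attends only to neighbours within +/- `window` positions."""
    seq_len, dim = q.shape
    scores = q @ k.T / dim ** 0.5                            # (seq_len, seq_len)
    idx = torch.arange(seq_len)
    band = (idx[:, None] - idx[None, :]).abs() <= window     # banded attention mask
    scores = scores.masked_fill(~band, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v


x = torch.randn(8, 16)
out = sliding_window_attention(x, x, x)
print(out.shape)  # torch.Size([8, 16])
```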
1 code implementation • ICML 2020 • Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi
Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples.
1 code implementation • IJCNLP 2019 • Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith
Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities.
Ranked #11 on Entity Linking on AIDA-CoNLL
1 code implementation • ACL 2019 • Robert Logan, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
Modeling human language requires the ability to not only generate fluent text but also encode factual knowledge.
no code implementations • NAACL 2019 • Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf
The classic supervised machine learning paradigm is based on learning in isolation, a single predictive model for a task using a single dataset.
no code implementations • NAACL 2019 • Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language.
no code implementations • WS 2019 • Matthew E. Peters, Sebastian Ruder, Noah A. Smith
While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task.
no code implementations • EMNLP 2018 • Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks.
46 code implementations • NAACL 2018 • Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
Ranked #4 on Citation Intent Classification on ACL-ARC (using extra training data)
Tasks: Citation Intent Classification, Conversational Response Selection, +7
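A minimal sketch of ELMo's combination of biLM layers for the entry above: the deep representation is a softmax-weighted sum of all layer activations, scaled by a task-specific gamma. The random layer activations stand in for a pretrained biLM's outputs.

```python
import torch

num_layers, seq_len, dim = 3, 5, 1024
layer_activations = torch.randn(num_layers, seq_len, dim)   # stand-in for biLM layer outputs

s = torch.zeros(num_layers, requires_grad=True)             # task-specific layer weights
gamma = torch.ones(1, requires_grad=True)                   # task-specific scale

weights = torch.softmax(s, dim=0)                           # normalize the layer weights
elmo = gamma * (weights[:, None, None] * layer_activations).sum(dim=0)
print(elmo.shape)  # torch.Size([5, 1024])
```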
2 code implementations • ACL 2017 • Matthew E. Peters, Waleed Ammar, Chandra Bhagavatula, Russell Power
Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks.
Ranked #50 on Named Entity Recognition (NER) on CoNLL 2003 (English)