Search Results for author: Matthew E. Peters

Found 27 papers, 18 papers with code

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering

1 code implementation • 24 May 2023 • Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan

The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks.

Question Answering · Text Generation

TESS: Text-to-Text Self-Conditioned Simplex Diffusion

no code implementations • 15 May 2023 • Rabeeh Karimi Mahabadi, Jaesung Tae, Hamish Ivison, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan

Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various domains with continuous-valued inputs.

Natural Language Understanding · Paraphrase Generation · +3

AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models

no code implementations • 14 Feb 2023 • Alexandra Chronopoulou, Matthew E. Peters, Alexander Fraser, Jesse Dodge

We also explore weight averaging of adapters trained on the same domain with different hyper-parameters, and show that it preserves the performance of a PLM on new domains while obtaining strong in-domain results.

Clustering · Language Modelling · +3
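The weight-averaging step at the core of AdapterSoup can be sketched in a few lines. This is a minimal illustration that assumes each adapter checkpoint is a dict mapping parameter names to flat lists of floats; the actual method operates on real adapter modules and selects which adapters to average per domain.

```python
def average_adapters(adapters):
    """Uniformly average a list of adapter checkpoints.

    Each checkpoint is assumed (for illustration) to be a dict mapping
    parameter names to flat lists of floats with matching shapes.
    """
    n = len(adapters)
    return {
        name: [sum(a[name][i] for a in adapters) / n
               for i in range(len(adapters[0][name]))]
        for name in adapters[0]
    }

# Two toy adapters trained on the same domain with different hyper-parameters:
a = {"down_proj.weight": [1.0, 2.0], "up_proj.weight": [0.0, 4.0]}
b = {"down_proj.weight": [3.0, 4.0], "up_proj.weight": [2.0, 0.0]}
soup = average_adapters([a, b])
print(soup)  # {'down_proj.weight': [2.0, 3.0], 'up_proj.weight': [1.0, 2.0]}
```

Because the soup is a single set of parameters, inference on a new domain costs no more than running one adapter.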

Does Self-Rationalization Improve Robustness to Spurious Correlations?

no code implementations • 24 Oct 2022 • Alexis Ross, Matthew E. Peters, Ana Marasović

Specifically, we evaluate how training self-rationalization models with free-text rationales affects robustness to spurious correlations in fine-tuned encoder-decoder and decoder-only models of six different sizes.

ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts

1 code implementation • 24 May 2022 • Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi

Our method, called ATTEMPT (ATTEntional Mixtures of Prompt Tuning), encodes large-scale source tasks into a small number of parameters as source prompts, then trains an attention module to interpolate the source prompts with a newly initialized target prompt for every instance in the target task.

Few-Shot Learning · Language Modelling · +1
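The per-instance interpolation can be sketched as follows. As simplifying assumptions, prompts are flattened to plain float vectors and attention scores are raw dot products with the instance representation; the paper trains a dedicated attention module instead.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mix_prompts(instance_repr, source_prompts, target_prompt):
    """Attend over source prompts plus the target prompt, then return
    their weighted interpolation for this one instance."""
    prompts = source_prompts + [target_prompt]
    scores = [sum(x * p for x, p in zip(instance_repr, pr)) for pr in prompts]
    weights = softmax(scores)
    dim = len(target_prompt)
    return [sum(w * pr[i] for w, pr in zip(weights, prompts))
            for i in range(dim)]

# A zero instance representation gives both prompts equal weight,
# so the mixture is their plain average:
mixed = mix_prompts([0.0, 0.0], [[1.0, 0.0]], [3.0, 2.0])
print(mixed)  # [2.0, 1.0]
```

A different instance representation shifts the attention weights, which is what makes the resulting prompt instance-specific.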

Extracting Latent Steering Vectors from Pretrained Language Models

1 code implementation • Findings (ACL) 2022 • Nishant Subramani, Nivedita Suresh, Matthew E. Peters

Experiments show that there exist steering vectors, which, when added to the hidden states of the language model, generate a target sentence nearly perfectly (> 99 BLEU) for English sentences from a variety of domains.

Language Modelling · Sentence Similarity · +3
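The injection step itself is simple: a fixed vector is added to the hidden states of the frozen model. In the paper the vector is found by gradient-based search so that the model decodes a chosen target sentence; the sketch below only illustrates the addition, on plain float lists standing in for hidden states.

```python
def steer(hidden_states, steering_vector, scale=1.0):
    """Add a fixed steering vector to the hidden state at every position.

    hidden_states: list of per-position vectors (lists of floats).
    steering_vector: one vector of the same width.
    """
    return [[h + scale * s for h, s in zip(state, steering_vector)]
            for state in hidden_states]

states = [[1.0, 2.0], [3.0, 4.0]]   # toy hidden states for two positions
print(steer(states, [0.5, -0.5]))   # [[1.5, 1.5], [3.5, 3.5]]
```

Since only the added vector changes, the language model's weights stay untouched, which is what makes the extracted vectors reusable for unsupervised style transfer.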

Hyperdecoders: Instance-specific decoders for multi-task NLP

1 code implementation • 15 Mar 2022 • Hamish Ivison, Matthew E. Peters

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder.
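The hypernetwork idea can be sketched as a small function from the encoder's pooled representation to a fresh set of adapter parameters per input. Shapes and the single linear map here are illustrative assumptions; the paper's hypernetworks generate full decoder adaptations.

```python
def hypernetwork(encoder_repr, weight, bias):
    """Linear hypernetwork: map a pooled encoder representation to a flat
    vector of decoder-adapter parameters."""
    return [sum(w * x for w, x in zip(row, encoder_repr)) + b
            for row, b in zip(weight, bias)]

def generate_adapter(encoder_repr, weight, bias, adapter_dim):
    """Reshape the generated flat vector into a toy adapter matrix, so
    every input instance gets its own decoder adaptation."""
    flat = hypernetwork(encoder_repr, weight, bias)
    return [flat[i:i + adapter_dim] for i in range(0, len(flat), adapter_dim)]

# A 2-d encoder representation generates a 2x2 adapter matrix:
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0, 1.0]
print(generate_adapter([2.0, 3.0], W, b, 2))  # [[2.0, 3.0], [5.0, 1.0]]
```

Only the hypernetwork's own parameters are trained, so the approach stays parameter-efficient while still producing a different adaptation for every input.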

Efficient Hierarchical Domain Adaptation for Pretrained Language Models

1 code implementation • NAACL 2022 • Alexandra Chronopoulou, Matthew E. Peters, Jesse Dodge

The remarkable success of large language models has been driven by dense models trained on massive unlabeled, unstructured corpora.

Domain Adaptation · Language Modelling

Few-Shot Self-Rationalization with Natural Language Prompts

1 code implementation • Findings (NAACL) 2022 • Ana Marasović, Iz Beltagy, Doug Downey, Matthew E. Peters

We identify the right prompting approach by extensively exploring natural language prompts on FEB. Then, by using this prompt and scaling the model size, we demonstrate that making progress on few-shot self-rationalization is possible.

Beyond Paragraphs: NLP for Long Sequences

1 code implementation • NAACL 2021 • Iz Beltagy, Arman Cohan, Hannaneh Hajishirzi, Sewon Min, Matthew E. Peters

In this tutorial, we aim at bringing interested NLP researchers up to speed about the recent and ongoing techniques for document-level representation learning.

Representation Learning

Competency Problems: On Finding and Removing Artifacts in Language Data

no code implementations • EMNLP 2021 • Matt Gardner, William Merrill, Jesse Dodge, Matthew E. Peters, Alexis Ross, Sameer Singh, Noah A. Smith

In this work we argue that for complex language understanding tasks, all simple feature correlations are spurious, and we formalize this notion into a class of problems which we call competency problems.

CDLM: Cross-Document Language Modeling

2 code implementations • Findings (EMNLP) 2021 • Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan

We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective.

Citation Recommendation · Coreference Resolution · +6
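Data preparation for a cross-document masked LM can be sketched as follows: related documents are concatenated with boundary tokens and a fraction of ordinary tokens is masked. The boundary-token names and plain random masking here are illustrative; the paper additionally assigns global (Longformer-style) attention to the masked positions so they can attend across all documents.

```python
import random

def build_mlm_example(documents, mask_rate=0.15, seed=0):
    """Concatenate related documents with boundary tokens and mask tokens
    for a masked-LM objective. Returns (inputs, labels); labels is None
    at unmasked positions and the original token at masked ones."""
    rng = random.Random(seed)
    tokens = []
    for doc in documents:
        tokens += ["<doc-s>"] + doc + ["</doc-s>"]
    inputs, labels = [], []
    for tok in tokens:
        if tok not in ("<doc-s>", "</doc-s>") and rng.random() < mask_rate:
            inputs.append("[MASK]")
            labels.append(tok)
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels

inputs, labels = build_mlm_example([["the", "cat"], ["a", "cat"]], mask_rate=0.5)
```

Because the masked token may be recoverable from a parallel document (here, the repeated "cat"), the objective pushes the model to align information across documents.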

Explaining NLP Models via Minimal Contrastive Editing (MiCE)

1 code implementation • Findings (ACL) 2021 • Alexis Ross, Ana Marasović, Matthew E. Peters

Humans have been shown to give contrastive explanations, which explain why an observed event happened rather than some other counterfactual event (the contrast case).

Multiple-choice · Question Answering · +3

Longformer: The Long-Document Transformer

16 code implementations • 10 Apr 2020 • Iz Beltagy, Matthew E. Peters, Arman Cohan

To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.

Language Modelling · Question Answering · +1
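The linear scaling comes from restricting attention to a sliding window. A minimal sketch of the attention pattern (here `window` is the per-side reach, a simplification of the paper's window parameter, and global-attention tokens are omitted):

```python
def sliding_window_pairs(seq_len, window):
    """Enumerate the (query, key) pairs under sliding-window attention:
    each token attends only to tokens within `window` positions of it,
    so the pair count grows linearly in seq_len, not quadratically."""
    pairs = []
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        pairs.extend((i, j) for j in range(lo, hi))
    return pairs

# Full self-attention over 4096 tokens scores 4096**2 ≈ 16.8M pairs;
# a per-side window of 256 scores at most 4096 * 513 ≈ 2.1M.
print(len(sliding_window_pairs(10, 2)))  # 44
```

Stacking such layers still lets information propagate across the whole document, since each layer widens the effective receptive field by one window.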

Adversarial Filters of Dataset Biases

1 code implementation • ICML 2020 • Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi

Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples.

Natural Language Inference

Knowledge Enhanced Contextual Word Representations

1 code implementation • IJCNLP 2019 • Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith

Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities.

Entity Linking · Entity Typing · +3

Transfer Learning in Natural Language Processing

no code implementations • NAACL 2019 • Sebastian Ruder, Matthew E. Peters, Swabha Swayamdipta, Thomas Wolf

The classic supervised machine learning paradigm is based on learning in isolation, a single predictive model for a task using a single dataset.

Transfer Learning · Word Embeddings

Linguistic Knowledge and Transferability of Contextual Representations

no code implementations • NAACL 2019 • Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith

Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language.

Language Modelling

To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks

no code implementations • WS 2019 • Matthew E. Peters, Sebastian Ruder, Noah A. Smith

While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task.

Transfer Learning

Dissecting Contextual Word Embeddings: Architecture and Representation

no code implementations • EMNLP 2018 • Matthew E. Peters, Mark Neumann, Luke Zettlemoyer, Wen-tau Yih

Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks.

Word Embeddings

Deep contextualized word representations

46 code implementations • NAACL 2018 • Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).

Ranked #4 on Citation Intent Classification on ACL-ARC (using extra training data)

Citation Intent Classification · Conversational Response Selection · +7
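ELMo's task-specific combination of the biLM's layer representations (softmax-normalized scalar weights over layers, scaled by a task parameter gamma) can be sketched directly. The toy numbers below are illustrative.

```python
import math

def elmo_embedding(layer_reps, scalar_params, gamma=1.0):
    """Collapse the biLM's per-layer representations of one token into a
    single vector: softmax-normalize the scalar layer weights, take the
    weighted sum of layers, then scale by gamma."""
    m = max(scalar_params)
    exps = [math.exp(s - m) for s in scalar_params]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(layer_reps[0])
    return [gamma * sum(w * rep[i] for w, rep in zip(weights, layer_reps))
            for i in range(dim)]

# Equal scalar parameters reduce to a plain average of the two layers:
print(elmo_embedding([[2.0, 4.0], [4.0, 8.0]], [0.0, 0.0]))  # [3.0, 6.0]
```

The scalar weights and gamma are learned per downstream task, so different tasks can emphasize different layers (e.g., syntax-heavy tasks weighting lower layers).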
