Search Results for author: Sewon Min

Found 40 papers, 30 papers with code

Zero- and Few-Shot NLP with Pretrained Language Models

no code implementations ACL 2022 Iz Beltagy, Arman Cohan, Robert Logan IV, Sewon Min, Sameer Singh

The ability to efficiently learn from little-to-no data is critical to applying NLP to tasks where data collection is costly or otherwise difficult.

Few-Shot Learning

Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens

no code implementations 30 Jan 2024 Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi

The $\infty$-gram framework and infini-gram engine enable us to conduct many novel and interesting analyses of human-written and machine-generated text: we find that the $\infty$-gram LM has fairly high accuracy for next-token prediction (47%), and can complement neural LLMs to greatly reduce their language modeling perplexities.

Language Modelling

In-Context Pretraining: Language Modeling Beyond Document Boundaries

no code implementations 16 Oct 2023 Weijia Shi, Sewon Min, Maria Lomeli, Chunting Zhou, Margaret Li, Rich James, Xi Victoria Lin, Noah A. Smith, Luke Zettlemoyer, Scott Yih, Mike Lewis

Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to document completion.

In-Context Learning Language Modelling +1

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

no code implementations 2 Oct 2023 Qingqing Cao, Sewon Min, Yizhong Wang, Hannaneh Hajishirzi

Retrieval augmentation addresses many critical problems in large language models such as hallucination, staleness, and privacy leaks.

Hallucination Retrieval

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

1 code implementation 8 Aug 2023 Sewon Min, Suchin Gururangan, Eric Wallace, Hannaneh Hajishirzi, Noah A. Smith, Luke Zettlemoyer

SILO is built by (1) training a parametric LM on Open License Corpus (OLC), a new corpus we curate with 228B tokens of public domain and permissively licensed text and (2) augmenting it with a more general and easily modifiable nonparametric datastore (e.g., containing copyrighted books or news) that is only queried during inference.

Language Modelling Sentence

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

3 code implementations 23 May 2023 Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi

Evaluating the factuality of long-form text generated by large language models (LMs) is non-trivial because (1) generations often contain a mixture of supported and unsupported pieces of information, making binary judgments of quality inadequate, and (2) human evaluation is time-consuming and costly.

Language Modelling Retrieval +1

REPLUG: Retrieval-Augmented Black-Box Language Models

1 code implementation 30 Jan 2023 Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih

We introduce REPLUG, a retrieval-augmented language modeling framework that treats the language model (LM) as a black box and augments it with a tuneable retrieval model.

Language Modelling Multi-task Language Understanding +2

Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters

2 code implementations 20 Dec 2022 Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer, Huan Sun

Chain-of-Thought (CoT) prompting can dramatically improve the multi-step reasoning abilities of large language models (LLMs).

Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations

2 code implementations 19 Dec 2022 Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh Hajishirzi

Although large language models can be prompted for both zero- and few-shot learning, performance drops significantly when no demonstrations are available.

Few-Shot Learning In-Context Learning +1

Nonparametric Masked Language Modeling

1 code implementation 2 Dec 2022 Sewon Min, Weijia Shi, Mike Lewis, Xilun Chen, Wen-tau Yih, Hannaneh Hajishirzi, Luke Zettlemoyer

Existing language models (LMs) predict tokens with a softmax over a finite vocabulary, which can make it difficult to predict rare tokens or phrases.

Language Modelling Masked Language Modeling +2

CREPE: Open-Domain Question Answering with False Presuppositions

1 code implementation 30 Nov 2022 Xinyan Velocity Yu, Sewon Min, Luke Zettlemoyer, Hannaneh Hajishirzi

We find that 25% of questions contain false presuppositions, and provide annotations for these presuppositions and their corrections.

Open-Domain Question Answering

Measuring and Narrowing the Compositionality Gap in Language Models

1 code implementation 7 Oct 2022 Ofir Press, Muru Zhang, Sewon Min, Ludwig Schmidt, Noah A. Smith, Mike Lewis

We investigate the ability of language models to perform compositional reasoning tasks where the overall solution depends on correctly composing the answers to sub-problems.

Question Answering

Re-Examining Calibration: The Case of Question Answering

1 code implementation 25 May 2022 Chenglei Si, Chen Zhao, Sewon Min, Jordan Boyd-Graber

Building on those observations, we propose a new calibration metric, MacroCE, that better captures whether the model assigns low confidence to wrong predictions and high confidence to correct predictions.

Open-Domain Question Answering

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

1 code implementation 25 Feb 2022 Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer

Large language models (LMs) are able to in-context learn -- perform a new task via inference alone by conditioning on a few input-label pairs (demonstrations) and making predictions for new inputs.

In-Context Learning

MetaICL: Learning to Learn In Context

2 code implementations NAACL 2022 Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi

We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.

Few-Shot Learning In-Context Learning +4

FaVIQ: FAct Verification from Information-seeking Questions

1 code implementation ACL 2022 Jungsoo Park, Sewon Min, Jaewoo Kang, Luke Zettlemoyer, Hannaneh Hajishirzi

Claims in FaVIQ are verified to be natural, contain little lexical bias, and require a complete understanding of the evidence for verification.

Fact Checking Fact Verification +1

RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

no code implementations NAACL 2021 Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.

Machine Reading Comprehension Natural Questions +3

Beyond Paragraphs: NLP for Long Sequences

1 code implementation NAACL 2021 Iz Beltagy, Arman Cohan, Hannaneh Hajishirzi, Sewon Min, Matthew E. Peters

In this tutorial, we aim at bringing interested NLP researchers up to speed about the recent and ongoing techniques for document-level representation learning.

Representation Learning

Joint Passage Ranking for Diverse Multi-Answer Retrieval

no code implementations EMNLP 2021 Sewon Min, Kenton Lee, Ming-Wei Chang, Kristina Toutanova, Hannaneh Hajishirzi

We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a given question.

Answer Generation Passage Ranking +4

RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering

1 code implementation 21 Oct 2020 Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples.

Machine Reading Comprehension Natural Questions +3

Efficient One-Pass End-to-End Entity Linking for Questions

3 code implementations EMNLP 2020 Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, Wen-tau Yih

We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass.

Entity Linking Question Answering

UnifiedQA: Crossing Format Boundaries With a Single QA System

2 code implementations Findings of the Association for Computational Linguistics 2020 Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi

As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UnifiedQA, that performs surprisingly well across 17 QA datasets spanning 4 diverse formats.

Common Sense Reasoning Language Modelling +3

AmbigQA: Answering Ambiguous Open-domain Questions

2 code implementations EMNLP 2020 Sewon Min, Julian Michael, Hannaneh Hajishirzi, Luke Zettlemoyer

Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer.

Open-Domain Question Answering Weakly-supervised Learning

Dense Passage Retrieval for Open-Domain Question Answering

17 code implementations EMNLP 2020 Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih

Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method.

Open-Domain Question Answering Passage Retrieval +1

Knowledge Guided Text Retrieval and Reading for Open Domain Question Answering

7 code implementations 10 Nov 2019 Sewon Min, Danqi Chen, Luke Zettlemoyer, Hannaneh Hajishirzi

We introduce an approach for open-domain question answering (QA) that retrieves and reads a passage graph, where vertices are passages of text and edges represent relationships that are derived from an external knowledge base or co-occurrence in the same article.

Natural Questions Open-Domain Question Answering +5

On Making Reading Comprehension More Comprehensive

no code implementations WS 2019 Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon Min

In this work, we justify a question answering approach to reading comprehension and describe the various kinds of questions one might use to more fully test a system's comprehension of a passage, moving beyond questions that only probe local predicate-argument structures.

Machine Reading Comprehension Question Answering +1

Question Answering is a Format; When is it Useful?

no code implementations 25 Sep 2019 Matt Gardner, Jonathan Berant, Hannaneh Hajishirzi, Alon Talmor, Sewon Min

In this opinion piece, we argue that question answering should be considered a format which is sometimes useful for studying particular phenomena, not a phenomenon or task in itself.

Machine Translation Question Answering +4

Neural Speed Reading via Skim-RNN

1 code implementation ICLR 2018 Minjoon Seo, Sewon Min, Ali Farhadi, Hannaneh Hajishirzi

Inspired by the principles of speed reading, we introduce Skim-RNN, a recurrent neural network (RNN) that dynamically decides to update only a small fraction of the hidden state for relatively unimportant input tokens.

Question Answering through Transfer Learning from Large Fine-grained Supervision Data

1 code implementation ACL 2017 Sewon Min, Minjoon Seo, Hannaneh Hajishirzi

We show that the task of question answering (QA) can significantly benefit from the transfer learning of models trained on a different large, fine-grained QA dataset.

Question Answering Transfer Learning
