Search Results for author: Iz Beltagy

Found 25 papers, 20 papers with code

Zero- and Few-Shot NLP with Pretrained Language Models

no code implementations ACL 2022 Iz Beltagy, Arman Cohan, Robert Logan IV, Sewon Min, Sameer Singh

The ability to efficiently learn from little-to-no data is critical to applying NLP to tasks where data collection is costly or otherwise difficult.

Few-Shot Learning, Pretrained Language Models

MS^2: Multi-Document Summarization of Medical Studies

1 code implementation EMNLP 2021 Jay DeYoung, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl, Lucy Lu Wang

In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20k summaries derived from the scientific literature.

Document Summarization, Multi-Document Summarization

What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?

1 code implementation 12 Apr 2022 Thomas Wang, Adam Roberts, Daniel Hesslow, Teven Le Scao, Hyung Won Chung, Iz Beltagy, Julien Launay, Colin Raffel

In particular, we focus on text-to-text models and experiment with three model architectures (causal/non-causal decoder-only and encoder-decoder), trained with two different pretraining objectives (autoregressive and masked language modeling), and evaluated with and without multitask prompted finetuning.

Language Modelling, Masked Language Modeling
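
As a rough illustration of the architectural distinction the paper studies, the sketch below (not the authors' code) builds the attention masks of a causal decoder-only model and a non-causal (prefix-LM) decoder-only model in PyTorch.

    # Illustrative only: attention patterns for causal vs. non-causal decoder-only models.
    import torch

    def causal_mask(seq_len: int) -> torch.Tensor:
        # Each position attends only to itself and earlier positions.
        return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

    def prefix_lm_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
        # The prefix (the input/prompt) is visible bidirectionally; targets stay causal.
        mask = causal_mask(seq_len)
        mask[:, :prefix_len] = True
        return mask

    print(causal_mask(5).int())
    print(prefix_lm_mask(5, prefix_len=2).int())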

Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search

no code implementations 16 Mar 2022 Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, Doug Downey

Based on our findings, we present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.

Abstractive Text Summarization
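
PINOCCHIO's actual consistency checks are not reproduced here; the toy sketch below only shows the general shape of a beam search that discards candidate continuations rejected by a user-supplied consistency test. The vocabulary, scoring function, and check are all illustrative.

    # Toy sketch of consistency-constrained beam search (illustrative only).
    import math
    from typing import Callable, List, Tuple

    def constrained_beam_search(
        step_logprobs: Callable[[List[str]], dict],      # prefix -> {token: logprob}
        is_consistent: Callable[[List[str], str], bool], # stand-in for the real checks
        beam_size: int = 3,
        max_len: int = 5,
    ) -> List[str]:
        beams: List[Tuple[List[str], float]] = [([], 0.0)]
        for _ in range(max_len):
            candidates = []
            for prefix, score in beams:
                for token, lp in step_logprobs(prefix).items():
                    # Skip continuations that the consistency check rejects.
                    if not is_consistent(prefix, token):
                        continue
                    candidates.append((prefix + [token], score + lp))
            if not candidates:
                break
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        return beams[0][0]

    # Tiny demo: uniform next-token distribution; block the token "unsupported".
    vocab = ["the", "study", "reports", "unsupported", "results"]
    uniform = {t: math.log(1.0 / len(vocab)) for t in vocab}
    best = constrained_beam_search(
        step_logprobs=lambda prefix: uniform,
        is_consistent=lambda prefix, tok: tok != "unsupported",
    )
    print(best)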

Staged Training for Transformer Language Models

1 code implementation 11 Mar 2022 Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew Peters, Iz Beltagy

As an alternative, we consider a staged training setup that begins with a small model and incrementally increases the amount of compute used for training by applying a "growth operator" to increase the model depth and width.
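
A toy illustration of the idea, not the paper's operators (which are designed to preserve training dynamics and act on the full depth and width of a transformer): growing a single linear layer's width by embedding its weights in a larger layer.

    # Toy width "growth operator" on one linear layer (illustrative only).
    import torch
    import torch.nn as nn

    def grow_linear(layer: nn.Linear, new_in: int, new_out: int) -> nn.Linear:
        grown = nn.Linear(new_in, new_out)
        with torch.no_grad():
            grown.weight.zero_()
            grown.bias.zero_()
            # Copy the small layer's parameters into the top-left block.
            grown.weight[: layer.out_features, : layer.in_features] = layer.weight
            grown.bias[: layer.out_features] = layer.bias
        return grown

    small = nn.Linear(64, 64)
    large = grow_linear(small, new_in=128, new_out=128)
    print(small.weight.shape, "->", large.weight.shape)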

MultiVerS: Improving scientific claim verification with weak supervision and full-document context

2 code implementations Findings (NAACL) 2022 David Wadden, Kyle Lo, Lucy Lu Wang, Arman Cohan, Iz Beltagy, Hannaneh Hajishirzi

Our approach outperforms two competitive baselines on three scientific claim verification datasets, with particularly strong performance in zero / few-shot domain adaptation experiments.

Claim Verification, Domain Adaptation +1

Few-Shot Self-Rationalization with Natural Language Prompts

no code implementations Findings (NAACL) 2022 Ana Marasović, Iz Beltagy, Doug Downey, Matthew E. Peters

We identify the right prompting approach by extensively exploring natural language prompts on FEB. Then, by using this prompt and scaling the model size, we demonstrate that making progress on few-shot self-rationalization is possible.

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

1 code implementation ACL 2022 Wen Xiao, Iz Beltagy, Giuseppe Carenini, Arman Cohan

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.

Abstractive Text Summarization, Document Summarization +1
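
A hedged usage sketch, assuming the PRIMERA checkpoint published on the HuggingFace Hub as allenai/PRIMERA loads as a standard seq2seq (LED-based) model and expects a <doc-sep> token between input documents; consult the model card for the exact invocation.

    # Assumed model id and separator token -- verify against the PRIMERA model card.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("allenai/PRIMERA")
    model = AutoModelForSeq2SeqLM.from_pretrained("allenai/PRIMERA")

    docs = ["First study report ...", "Second study report ..."]
    text = " <doc-sep> ".join(docs)   # assumed document-separator token
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    summary_ids = model.generate(**inputs, max_length=256, num_beams=4)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))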

FLEX: Unifying Evaluation for Few-Shot NLP

2 code implementations NeurIPS 2021 Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy

Few-shot NLP research is highly active, yet conducted in disjoint research threads with evaluation suites that lack challenging-yet-realistic testing setups and fail to employ careful experimental design.

Experimental Design, Few-Shot Learning +1

Beyond Paragraphs: NLP for Long Sequences

1 code implementation NAACL 2021 Iz Beltagy, Arman Cohan, Hannaneh Hajishirzi, Sewon Min, Matthew E. Peters

In this tutorial, we aim to bring interested NLP researchers up to speed on recent and ongoing techniques for document-level representation learning.

Representation Learning

MS2: Multi-Document Summarization of Medical Studies

1 code implementation 13 Apr 2021 Jay DeYoung, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl, Lucy Lu Wang

In support of this goal, we release MS^2 (Multi-Document Summarization of Medical Studies), a dataset of over 470k documents and 20k summaries derived from the scientific literature.

Document Summarization, Multi-Document Summarization

CDLM: Cross-Document Language Modeling

2 code implementations Findings (EMNLP) 2021 Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan

We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective.

Citation Recommendation, Coreference Resolution +6
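
A schematic (not the authors' preprocessing code) of the two ideas described above: related documents are concatenated into one sequence, tokens are masked for the MLM objective, and global attention is placed on the masked positions so they can attend across documents. The separator and mask token names are illustrative.

    # Schematic construction of a cross-document MLM example (illustrative only).
    import random

    def build_cdlm_example(documents, mask_prob=0.15, mask_token="<mask>", doc_sep="<doc-s>"):
        tokens, labels = [], []
        for doc in documents:
            tokens.append(doc_sep)
            labels.append(None)
            for tok in doc.split():
                if random.random() < mask_prob:
                    tokens.append(mask_token)
                    labels.append(tok)          # predict the original token
                else:
                    tokens.append(tok)
                    labels.append(None)
        # Global attention on masked positions lets them attend across documents.
        global_attention = [1 if t == mask_token else 0 for t in tokens]
        return tokens, labels, global_attention

    toks, labels, gmask = build_cdlm_example(
        ["the trial reports improved outcomes", "a related study finds similar effects"]
    )
    print(list(zip(toks, gmask)))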

SciREX: A Challenge Dataset for Document-Level Information Extraction

1 code implementation ACL 2020 Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy

It is challenging to create a large-scale information extraction (IE) dataset at the document level since it requires an understanding of the whole document to annotate entities and their document-level relationships that usually span beyond sentences or even sections.

SPECTER: Document-level Representation Learning using Citation-informed Transformers

5 code implementations ACL 2020 Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld

We propose SPECTER, a new method to generate document-level embedding of scientific documents based on pretraining a Transformer language model on a powerful signal of document-level relatedness: the citation graph.

Citation Prediction, Document Classification +5
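
A minimal sketch of producing SPECTER embeddings with the allenai/specter checkpoint on the HuggingFace Hub: each paper's title and abstract are concatenated with the separator token, and the final-layer [CLS] vector serves as the document embedding.

    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("allenai/specter")
    model = AutoModel.from_pretrained("allenai/specter")

    papers = [
        {"title": "SciBERT: A Pretrained Language Model for Scientific Text",
         "abstract": "Obtaining large-scale annotated data for NLP tasks ..."},
        {"title": "Longformer: The Long-Document Transformer",
         "abstract": "Transformer-based models struggle with long sequences ..."},
    ]
    # Concatenate title and abstract with the tokenizer's separator token.
    title_abs = [p["title"] + tokenizer.sep_token + p["abstract"] for p in papers]
    inputs = tokenizer(title_abs, padding=True, truncation=True,
                       max_length=512, return_tensors="pt")
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state[:, 0, :]   # [CLS] embedding per document
    print(embeddings.shape)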

Longformer: The Long-Document Transformer

10 code implementations 10 Apr 2020 Iz Beltagy, Matthew E. Peters, Arman Cohan

To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.

Language Modelling, Question Answering +1
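
A minimal usage sketch with the allenai/longformer-base-4096 checkpoint: the sliding-window attention handles the long input, and global attention is assigned to the first (<s>) token.

    import torch
    from transformers import LongformerModel, LongformerTokenizer

    tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
    model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

    text = "Long document text ... " * 500
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1   # global attention on the first token
    outputs = model(**inputs, global_attention_mask=global_attention_mask)
    print(outputs.last_hidden_state.shape)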

Pretrained Language Models for Sequential Sentence Classification

1 code implementation IJCNLP 2019 Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld

As a step toward better document-level understanding, we explore classification of a sequence of sentences into their corresponding categories, a task that requires understanding sentences in context of the document.

Classification, General Classification +2
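
A schematic of the joint-encoding idea, using bert-base-uncased as a stand-in encoder and an untrained classification head: the sentences of a document are packed into one sequence separated by [SEP], and each sentence is classified from the contextual representation of its [SEP] token.

    # Schematic of joint sequential sentence classification (head is untrained).
    import torch.nn as nn
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")
    num_labels = 5   # e.g. background / objective / method / result / conclusion
    head = nn.Linear(encoder.config.hidden_size, num_labels)

    sentences = ["We study abstract sentence roles.",
                 "We train a joint model over all sentences.",
                 "It outperforms sentence-by-sentence baselines."]
    text = f" {tokenizer.sep_token} ".join(sentences)
    inputs = tokenizer(text, return_tensors="pt")
    hidden = encoder(**inputs).last_hidden_state[0]

    sep_positions = (inputs["input_ids"][0] == tokenizer.sep_token_id).nonzero(as_tuple=True)[0]
    logits = head(hidden[sep_positions])      # one prediction per sentence
    print(logits.argmax(dim=-1))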

SciBERT: A Pretrained Language Model for Scientific Text

4 code implementations IJCNLP 2019 Iz Beltagy, Kyle Lo, Arman Cohan

Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive.

 Ranked #1 on Sentence Classification on Paper Field (using extra training data)

Citation Intent Classification, Dependency Parsing +6
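
A minimal sketch of loading SciBERT from the HuggingFace Hub (checkpoint allenai/scibert_scivocab_uncased) and encoding a scientific sentence.

    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("allenai/scibert_scivocab_uncased")
    model = AutoModel.from_pretrained("allenai/scibert_scivocab_uncased")

    inputs = tokenizer("Induction of apoptosis was observed in treated cells.",
                       return_tensors="pt")
    outputs = model(**inputs)
    print(outputs.last_hidden_state.shape)   # contextual embedding for each token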

ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing

1 code implementation WS 2019 Mark Neumann, Daniel King, Iz Beltagy, Waleed Ammar

Despite recent advances in natural language processing, many statistical models for processing text perform extremely poorly under domain shift.

Natural Language Processing
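
A minimal usage sketch, assuming the scispacy package and the en_core_sci_sm model package are installed.

    import spacy

    # Small scispaCy pipeline: tokenization, tagging, parsing, entity mentions.
    nlp = spacy.load("en_core_sci_sm")
    doc = nlp("Spinal and bulbar muscular atrophy is an inherited motor neuron disease.")
    print([(ent.text, ent.label_) for ent in doc.ents])
    print([(tok.text, tok.pos_, tok.dep_) for tok in doc[:5]])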

Combining Distant and Direct Supervision for Neural Relation Extraction

1 code implementation NAACL 2019 Iz Beltagy, Kyle Lo, Waleed Ammar

In relation extraction with distant supervision, noisy labels make it difficult to train quality models.

Relation Extraction
