no code implementations • EMNLP 2021 • Sascha Rothe, Joshua Maynez, Shashi Narayan
Task-agnostic pretraining objectives like masked language models or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al., 2019), but are outperformed by task-specific pretraining objectives like predicting extracted gap sentences on summarization (Zhang et al., 2020).
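A minimal sketch of the gap-sentence objective, under simplifying assumptions: a naive period-based sentence splitter, uniform sentence selection, and a placeholder mask token (Zhang et al. (2020) instead select sentences by importance):

```python
import random

MASK = "<mask_1>"  # placeholder sentinel; the real vocabulary entry is model-specific

def make_gap_sentence_example(document: str, gap_ratio: float = 0.3):
    """Mask whole sentences in a document; the model must generate them.

    Returns (input_text, target_text), mirroring the pretraining setup in
    which the target is the concatenation of the removed ("gap") sentences.
    """
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    n_gaps = max(1, int(len(sentences) * gap_ratio))
    gap_ids = set(random.sample(range(len(sentences)), n_gaps))
    inputs = [MASK if i in gap_ids else s for i, s in enumerate(sentences)]
    targets = [s for i, s in enumerate(sentences) if i in gap_ids]
    return ". ".join(inputs) + ".", ". ".join(targets) + "."
```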
no code implementations • Findings (NAACL) 2022 • Yongtai Liu, Joshua Maynez, Gonçalo Simões, Shashi Narayan
We present DADS, a novel Data Augmentation technique for low-resource Dialogue Summarization.
no code implementations • 12 Oct 2023 • Polina Zablotskaia, Misha Khalman, Rishabh Joshi, Livio Baldini Soares, Shoshana Jakobovits, Joshua Maynez, Shashi Narayan
Despite recent advances in abstractive text summarization, current summarization models still suffer from generating factually inconsistent summaries, reducing their utility for real-world applications.
no code implementations • 23 May 2023 • Fantine Huot, Joshua Maynez, Chris Alberti, Reinald Kim Amplayo, Priyanka Agrawal, Constanza Fierro, Shashi Narayan, Mirella Lapata
Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, allowing for the dissemination of relevant content across speakers of other languages.
no code implementations • 28 Apr 2023 • Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata
While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content.
no code implementations • 17 Apr 2023 • Polina Zablotskaia, Du Phan, Joshua Maynez, Shashi Narayan, Jie Ren, Jeremiah Liu
Modern deep models for summarization attain impressive benchmark performance, but they are prone to generating miscalibrated predictive uncertainty.
no code implementations • 20 Dec 2022 • Evgeniia Razumovskaia, Joshua Maynez, Annie Louis, Mirella Lapata, Shashi Narayan
Previous work has demonstrated the effectiveness of planning for story generation exclusively in a monolingual setting focusing primarily on English.
no code implementations • 20 Dec 2022 • Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
no code implementations • 31 Oct 2022 • Reinald Kim Amplayo, Kellie Webster, Michael Collins, Dipanjan Das, Shashi Narayan
Large language models (LLMs) have been shown to perform well in answering questions and in producing long-form texts, both in few-shot closed-book settings.
no code implementations • 30 Sep 2022 • Yao Zhao, Misha Khalman, Rishabh Joshi, Shashi Narayan, Mohammad Saleh, Peter J. Liu
Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences.
Ranked #1 on Abstractive Text Summarization on CNN / Daily Mail
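As background, a minimal sketch of the MLE objective referred to above, i.e. standard token-level cross-entropy for a conditional model (tensor shapes and names are illustrative):

```python
import torch.nn.functional as F
from torch import Tensor

def mle_loss(logits: Tensor, targets: Tensor, pad_id: int = 0) -> Tensor:
    """Token-level negative log-likelihood for a conditional language model.

    logits: (batch, seq_len, vocab) decoder outputs
    targets: (batch, seq_len) gold target token ids
    MLE pushes all probability mass toward the single observed target
    sequence, the behavior that sequence-level calibration aims to correct.
    """
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)),  # flatten to (batch * seq_len, vocab)
        targets.view(-1),
        ignore_index=pad_id,  # do not penalize padding positions
    )
```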
no code implementations • 1 Aug 2022 • Reinald Kim Amplayo, Peter J. Liu, Yao Zhao, Shashi Narayan
Specifically, we treat sentences as the basic units of matching instead of tokens, and use a sentence matching function to soft-match candidate and reference sentences.
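A minimal sketch of sentence-level soft matching, using token-overlap F1 as a stand-in matching function (the paper's matching function may be a learned or embedding-based scorer):

```python
def token_f1(a: str, b: str) -> float:
    """Soft match between two sentences via token-overlap F1 (illustrative)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    overlap = len(ta & tb)
    if overlap == 0:
        return 0.0
    p, r = overlap / len(ta), overlap / len(tb)
    return 2 * p * r / (p + r)

def sentence_level_score(candidate: list[str], reference: list[str]) -> float:
    """Score a candidate summary against a reference, matching whole sentences
    instead of tokens: align each reference sentence to its best-matching
    candidate sentence and vice versa, then combine into an F-measure."""
    recall = sum(max(token_f1(c, r) for c in candidate) for r in reference) / len(reference)
    precision = sum(max(token_f1(c, r) for r in reference) for c in candidate) / len(candidate)
    return 2 * precision * recall / (precision + recall + 1e-9)
```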
1 code implementation • 1 Jul 2022 • Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Anders Sandholm, Dipanjan Das, Mirella Lapata
The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details.
1 code implementation • ACL 2022 • Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies.
1 code implementation • Findings (EMNLP) 2021 • Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas
We show via data analysis that it is not only the models that are to blame: more than 27% of the facts mentioned in the gold summaries of MiRANews are better grounded in the assisting documents than in the main source articles.
no code implementations • ACL 2021 • Rahul Aralikatte, Shashi Narayan, Joshua Maynez, Sascha Rothe, Ryan Mcdonald
Professional summaries are written with document-level information, such as the theme of the document, in mind.
no code implementations • 15 Apr 2021 • Shashi Narayan, Yao Zhao, Joshua Maynez, Gonçalo Simões, Vitaly Nikolaev, Ryan Mcdonald
Moreover, we demonstrate empirically that planning with entity chains provides a mechanism to control hallucinations in abstractive summaries.
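A minimal sketch of entity-chain planning as target construction; the marker strings and the naive capitalized-word entity heuristic below are illustrative placeholders, not the paper's exact format:

```python
import re

def build_planned_target(summary: str) -> str:
    """Prefix the target summary with its ordered chain of entities, so the
    model first predicts a content plan and then a summary grounded in it.
    A real implementation would use a proper named-entity tagger."""
    entities, seen = [], set()
    for match in re.finditer(r"\b[A-Z][a-zA-Z]+(?:\s[A-Z][a-zA-Z]+)*", summary):
        if match.group(0) not in seen:
            seen.add(match.group(0))
            entities.append(match.group(0))
    chain = " | ".join(entities)
    return f"[ENTITYCHAIN] {chain} [SUMMARY] {summary}"
```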
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum
1 code implementation • EMNLP 2020 • Shashi Narayan, Joshua Maynez, Jakub Adamek, Daniele Pighin, Blaž Bratanič, Ryan Mcdonald
We propose encoder-centric stepwise models for extractive summarization using structured transformers -- HiBERT and Extended Transformers.
2 code implementations • ACL 2020 • Joshua Maynez, Shashi Narayan, Bernd Bohnet, Ryan Mcdonald
It is well known that the standard likelihood training and approximate decoding objectives in neural text generation models lead to less human-like responses for open-ended tasks such as language modeling and story generation.
no code implementations • 23 Apr 2020 • Shashi Narayan, Gonçalo Simões, Ji Ma, Hannah Craighead, Ryan Mcdonald
Recent trends in natural language processing have shifted focus towards pretraining and fine-tuning approaches for text generation.
no code implementations • 19 Oct 2019 • Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
We address the issue of hallucination in data-to-text generation, i.e., reducing the generation of text that is unsupported by the source.
6 code implementations • TACL 2020 • Sascha Rothe, Shashi Narayan, Aliaksei Severyn
Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing.
Ranked #1 on Split and Rephrase on WikiSplit
1 code implementation • 19 Jul 2019 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
We introduce 'extreme summarization', a new single-document summarization task which aims at creating a short, one-sentence news summary answering the question "What is the article about?".
1 code implementation • ACL 2019 • Hardy, Shashi Narayan, Andreas Vlachos
There has been substantial progress in summarization research enabled by the availability of novel, often large-scale, datasets and recent advances on neural network-based approaches.
1 code implementation • NAACL 2019 • Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen
We present a new neural model for text summarization that first extracts sentences from a document and then compresses them.
1 code implementation • EMNLP 2018 • Maximin Coavoux, Shashi Narayan, Shay B. Cohen
This article deals with adversarial attacks towards deep learning systems for Natural Language Processing (NLP), in the context of privacy protection.
3 code implementations • EMNLP 2018 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach.
Ranked #9 on Text Summarization on X-Sum
no code implementations • COLING 2018 • Joana Ribeiro, Shashi Narayan, Shay B. Cohen, Xavier Carreras
We show that the general problem of string transduction can be reduced to the problem of sequence labeling.
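A minimal sketch of the reduction: give each input token an edit label, and recover the output by applying the labels (the tag set here is illustrative; the paper's label space and model are more involved):

```python
def apply_edit_labels(tokens: list[str], labels: list[str]) -> list[str]:
    """Turn transduction into labeling: each input token carries one edit
    label (KEEP, DELETE, or REPLACE_<new>), and applying the labels
    reconstructs the output string."""
    out = []
    for tok, lab in zip(tokens, labels):
        if lab == "KEEP":
            out.append(tok)
        elif lab == "DELETE":
            continue
        elif lab.startswith("REPLACE_"):
            out.append(lab[len("REPLACE_"):])
    return out

# apply_edit_labels(["the", "cats", "sits"], ["KEEP", "KEEP", "REPLACE_sit"])
# -> ["the", "cats", "sit"]
```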
1 code implementation • ACL 2018 • Shashi Narayan, Ronald Cardenas, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata, Jiangsheng Yu, Yi Chang
Document modeling is essential to a variety of natural language understanding tasks.
no code implementations • NAACL 2018 • Claire Gardent, Shashi Narayan
Each text production task raises a slightly different communication goal (e.g., how to take the dialogue context into account when producing a dialogue turn; how to detect and merge relevant information when summarising a text; or how to produce a well-formed text that correctly captures the information contained in some input data in the case of data-to-text generation).
1 code implementation • NAACL 2018 • Shashi Narayan, Shay B. Cohen, Mirella Lapata
In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective.
Ranked #13 on Extractive Text Summarization on CNN / Daily Mail
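A minimal sketch of the reinforcement-learning objective; the ROUGE function, the way candidate extracts are sampled, and the absence of a baseline are simplifications of the paper's algorithm:

```python
import torch

def reinforce_loss(sentence_logits, rouge_fn, doc_sents, reference, k=3):
    """Policy-gradient step for extractive summarization: sample k sentences
    from the ranker's distribution, score the extract with ROUGE, and scale
    the sample's log-probability by that reward."""
    probs = torch.softmax(sentence_logits, dim=-1)  # one score per sentence
    sample = torch.multinomial(probs, num_samples=k, replacement=False)
    log_prob = torch.log(probs[sample]).sum()
    extract = " ".join(doc_sents[i] for i in sample.tolist())
    reward = rouge_fn(extract, reference)  # e.g. ROUGE-L F1 in [0, 1]
    return -reward * log_prob  # minimize the negative expected reward
```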
no code implementations • WS 2017 • Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini
The WebNLG challenge consists in mapping sets of RDF triples to text.
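An illustrative instance of the task, with a hypothetical WebNLG-style triple set and one acceptable verbalization:

```python
# A hypothetical WebNLG-style instance: a set of RDF triples and a reference text.
triples = [
    ("Aarhus_Airport", "cityServed", "Aarhus"),
    ("Aarhus", "country", "Denmark"),
]
text = "Aarhus Airport serves the city of Aarhus, Denmark."
```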
2 code implementations • EMNLP 2017 • Shashi Narayan, Claire Gardent, Shay B. Cohen, Anastasia Shimorina
We propose a new sentence simplification task (Split-and-Rephrase) where the aim is to split a complex sentence into a meaning preserving sequence of shorter sentences.
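An illustrative Split-and-Rephrase pair (the sentences are invented for illustration):

```python
complex_sentence = (
    "Alan Bean, who was born in Wheeler, Texas, served as a test pilot "
    "before joining NASA."
)
# One meaning-preserving rephrasing as a sequence of shorter sentences.
simple_sentences = [
    "Alan Bean was born in Wheeler, Texas.",
    "Alan Bean served as a test pilot.",
    "Alan Bean later joined NASA.",
]
```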
no code implementations • ACL 2017 • Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini
In this paper, we present a novel framework for semi-automatically creating linguistically challenging micro-planning data-to-text corpora from existing Knowledge Bases.
1 code implementation • 14 Apr 2017 • Shashi Narayan, Nikos Papasarantopoulos, Shay B. Cohen, Mirella Lapata
Most extractive summarization methods focus on the main body of the document from which sentences need to be extracted.
no code implementations • EACL 2017 • Renars Liepins, Ulrich Germann, Guntis Barzdins, Alexandra Birch, Steve Renals, Susanne Weber, Peggy van der Kreeft, Hervé Bourlard, João Prieto, Ondřej Klejch, Peter Bell, Alexandros Lazaridis, Alfonso Mendes, Sebastian Riedel, Mariana S. C. Almeida, Pedro Balage, Shay B. Cohen, Tomasz Dwojak, Philip N. Garner, Andreas Giefer, Marcin Junczys-Dowmunt, Hina Imran, David Nogueira, Ahmed Ali, Sebastião Miranda, Andrei Popescu-Belis, Lesly Miculicich Werlen, Nikos Papasarantopoulos, Abiola Obamuyide, Clive Jones, Fahim Dalvi, Andreas Vlachos, Yang Wang, Sibo Tong, Rico Sennrich, Nikolaos Pappas, Shashi Narayan, Marco Damonte, Nadir Durrani, Sameer Khurana, Ahmed Abdelali, Hassan Sajjad, Stephan Vogel, David Sheppey, Chris Hernon, Jeff Mitchell
We present the first prototype of the SUMMA Platform: an integrated platform for multilingual media monitoring.
no code implementations • ACL 2016 • Shashi Narayan, Shay B. Cohen
We describe a search algorithm for optimizing the number of latent states when estimating latent-variable PCFGs with spectral methods.
no code implementations • WS 2016 • Shashi Narayan, Siva Reddy, Shay B. Cohen
One of the limitations of semantic parsing approaches to open-domain question answering is the lexicosyntactic gap between natural language questions and knowledge base entries -- there are many ways to ask a question, all with the same answer.
no code implementations • TACL 2016 • Dominique Osborne, Shashi Narayan, Shay B. Cohen
Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views.
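A minimal usage sketch of CCA with scikit-learn, illustrating the two-view setting rather than the paper's specific method:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))  # view one of the same 200 items
Y = rng.normal(size=(200, 40))  # view two of the same 200 items

cca = CCA(n_components=10)
X_low, Y_low = cca.fit_transform(X, Y)  # maximally correlated low-dim projections
print(X_low.shape, Y_low.shape)  # (200, 10) (200, 10)
```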
1 code implementation • WS 2016 • Shashi Narayan, Claire Gardent
We present a novel approach to sentence simplification which departs from previous work in two main ways.
Ranked #2 on Text Simplification on PWKP / WikiSmall
no code implementations • EMNLP 2015 • Shashi Narayan, Shay B. Cohen
We describe an approach to create a diverse set of predictions with spectral learning of latent-variable PCFGs (L-PCFGs).