no code implementations • LREC 2012 • Alexandre Denis, Ingrid Falk, Claire Gardent, Laura Perez-Beltrachini
There has been much debate, both theoretical and practical, on how to link ontologies and lexicons in natural language processing (NLP) applications.
no code implementations • COLING 2016 • Laura Perez-Beltrachini, Rania Sayed, Claire Gardent
In Natural Language Generation (NLG), one important limitation is the lack of common benchmarks on which to train, evaluate and compare data-to-text generators.
no code implementations • CL 2017 • Claire Gardent, Laura Perez-Beltrachini
Although there has been much work in recent years on data-driven natural language generation, little attention has been paid to the fine-grained interactions that arise during microplanning between aggregation, surface realization, and sentence segmentation.
no code implementations • WS 2017 • Laura Perez-Beltrachini, Claire Gardent
Recently, several data-sets associating data to text have been created to train data-to-text surface realisers.
no code implementations • ACL 2017 • Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini
In this paper, we present a novel framework for semi-automatically creating linguistically challenging micro-planning data-to-text corpora from existing Knowledge Bases.
no code implementations • WS 2017 • Claire Gardent, Anastasia Shimorina, Shashi Narayan, Laura Perez-Beltrachini
The WebNLG challenge consists in mapping sets of RDF triples to text.
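For illustration, a WebNLG-style input is a small set of RDF triples that must be rendered as fluent text. The sketch below (made-up triples and hand-written templates, not actual challenge data or a submitted system) shows the shape of the mapping a generator has to learn:

```python
# Minimal sketch: a WebNLG-style set of RDF triples and a naive
# template-based verbalisation. Triples and templates are illustrative
# only, not actual WebNLG challenge data.
triples = [
    ("Alan_Bean", "birthPlace", "Wheeler,_Texas"),
    ("Alan_Bean", "occupation", "Test_pilot"),
]

TEMPLATES = {
    "birthPlace": "{s} was born in {o}.",
    "occupation": "{s} worked as a {o}.",
}

def verbalise(triples):
    sentences = []
    for s, p, o in triples:
        s, o = s.replace("_", " "), o.replace("_", " ")
        sentences.append(TEMPLATES[p].format(s=s, o=o))
    return " ".join(sentences)

print(verbalise(triples))
# Alan Bean was born in Wheeler, Texas. Alan Bean worked as a Test pilot.
```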
1 code implementation • NAACL 2018 • Laura Perez-Beltrachini, Mirella Lapata
A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e. g., facts in a database) and associated texts.
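As a rough illustration of what such data-text correspondences look like, the snippet below aligns facts to sentences with a crude token-overlap heuristic; this is not the method proposed in the paper, only a sketch of the alignment problem:

```python
# Illustrative sketch only: align database facts to sentences of an
# associated text by token overlap, a crude proxy for the data-text
# correspondences a statistical generator has to learn.
def overlap(fact, sentence):
    fact_tokens = set((fact[0] + " " + fact[2]).lower().split())
    sent_tokens = set(sentence.lower().replace(".", "").split())
    return len(fact_tokens & sent_tokens) / max(len(fact_tokens), 1)

facts = [("John Smith", "birth year", "1956"),
         ("John Smith", "instrument", "guitar")]
sentences = ["John Smith was born in 1956.",
             "John Smith plays the guitar."]

for fact in facts:
    best = max(sentences, key=lambda s: overlap(fact, s))
    print(fact, "->", best)
```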
2 code implementations • WS 2018 • Diego Marcheggiani, Laura Perez-Beltrachini
Most previous work on neural text generation from graph-structured data relies on standard sequence-to-sequence methods.
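One alternative to linearising the input graph for a sequence encoder is to encode it directly with graph convolutions. The layer below is a simplified, self-contained illustration of that idea (PyTorch assumed), not the exact encoder architecture described in the paper:

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """One graph-convolution step: each node aggregates its neighbours'
    representations through a shared linear map. Simplified illustration
    only, not the paper's encoder."""

    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, node_states, adjacency):
        # node_states: (num_nodes, dim); adjacency: (num_nodes, num_nodes),
        # row-normalised so each node averages over its neighbours.
        aggregated = adjacency @ node_states
        return torch.relu(self.linear(aggregated))

# Toy graph with 3 nodes (e.g. entities and relations from an RDF graph),
# self-loops included and rows normalised.
adj = torch.tensor([[0.5, 0.5, 0.0],
                    [1/3, 1/3, 1/3],
                    [0.0, 0.5, 0.5]])
states = torch.randn(3, 16)
encoder = SimpleGCNLayer(16)
print(encoder(states, adj).shape)  # torch.Size([3, 16])
```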
1 code implementation • ACL 2019 • Laura Perez-Beltrachini, Yang Liu, Mirella Lapata
Existing neural generation approaches create multi-sentence text as a single sequence.
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
no code implementations • 16 Jun 2021 • Simon Mille, Kaustubh D. Dhole, Saad Mahamood, Laura Perez-Beltrachini, Varun Gangal, Mihir Kale, Emiel van Miltenburg, Sebastian Gehrmann
By applying this framework to the GEM generation benchmark, we propose an evaluation suite made of 80 challenge sets, demonstrate the kinds of analyses that it enables and shed light onto the limits of current generation models.
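To give a sense of what a challenge set is, the sketch below derives one from existing benchmark inputs via a controlled perturbation (here, shuffling the order of input triples). This is a generic example, not one of the 80 sets in the proposed suite:

```python
import random

# Illustrative sketch: build a challenge set by applying a controlled
# transformation to existing benchmark examples. Example data is made up.
def shuffle_triples(example, seed=0):
    rng = random.Random(seed)
    perturbed = dict(example)
    perturbed["triples"] = example["triples"][:]
    rng.shuffle(perturbed["triples"])
    return perturbed

example = {
    "triples": [("Aarhus_Airport", "cityServed", "Aarhus"),
                ("Aarhus", "country", "Denmark")],
    "reference": "Aarhus Airport serves the city of Aarhus, Denmark.",
}
challenge_example = shuffle_triples(example, seed=42)
print(challenge_example["triples"])
```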
no code implementations • Journal of Artificial Intelligence Research 2021 • Laura Perez-Beltrachini, Mirella Lapata
The ability to convey relevant and diverse information is critical in multi-document summarization and yet remains elusive for neural seq-to-seq models whose outputs are often redundant and fail to correctly cover important details.
1 code implementation • EMNLP 2021 • Laura Perez-Beltrachini, Mirella Lapata
We present a cross-lingual summarisation corpus with long documents in a source language associated with multi-sentence summaries in a target language.
no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou
This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.
1 code implementation • 28 Jan 2023 • Laura Perez-Beltrachini, Parag Jain, Emilio Monti, Mirella Lapata
In this paper, we are interested in developing semantic parsers which understand natural language questions embedded in a conversation with a user and ground them to formal queries over definitions in a general purpose knowledge graph (KG) with very large vocabularies (covering thousands of concept names and relations, and millions of entities).
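An illustrative input/output pair for this task is sketched below: the parser must resolve the pronoun against the conversation history before producing the formal query. The example is made up, and the identifiers follow a Wikidata-like scheme purely for illustration:

```python
# Made-up example of conversational semantic parsing over a large KG.
# Identifiers follow a Wikidata-like scheme purely for illustration.
example = {
    "history": [
        ("user", "Who directed Inception?"),
        ("system", "Christopher Nolan"),
    ],
    "question": "Which other films did he direct?",  # "he" -> Christopher Nolan
    "query": """
        SELECT ?film WHERE {
          ?film wdt:P57 wd:Q25191 .     # director = Christopher Nolan
          FILTER(?film != wd:Q25188)    # exclude the film already discussed
        }
    """,
}
```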
1 code implementation • 20 Feb 2023 • Hanxu Hu, Yunqing Liu, Zhongyi Yu, Laura Perez-Beltrachini
In this work we study user controlled table-to-text generation, where users explore the content of a table by selecting cells and reading a natural language description thereof, automatically produced by a natural language generator.
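An illustrative sketch of the interaction follows (made-up table and a naive template generator, standing in for the neural model studied in the paper):

```python
# Illustrative sketch of the input/output format for user-controlled
# table-to-text generation. Table and generator are made up for illustration.
table = {
    "title": "1988 Summer Olympics, 100m final",
    "header": ["Athlete", "Country", "Time"],
    "rows": [
        ["Carl Lewis", "USA", "9.92"],
        ["Linford Christie", "GBR", "9.97"],
    ],
}
selected_cells = [(0, 0), (0, 2)]  # (row, column): Carl Lewis, 9.92

def describe(table, cells):
    values = [table["rows"][r][c] for r, c in cells]
    attrs = [table["header"][c] for _, c in cells]
    facts = ", ".join(f"{a.lower()} {v}" for a, v in zip(attrs, values))
    return f"In the {table['title']}, {facts}."

print(describe(table, selected_cells))
# In the 1988 Summer Olympics, 100m final, athlete Carl Lewis, time 9.92.
```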
1 code implementation • 27 Feb 2024 • Huajian Zhang, Yumo Xu, Laura Perez-Beltrachini
We study existing approaches to leverage off-the-shelf Natural Language Inference (NLI) models for the evaluation of summary faithfulness and argue that these are sub-optimal due to the granularity level considered for premises and hypotheses.
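A minimal sketch of the kind of off-the-shelf NLI scoring being analysed is given below, assuming a Hugging Face MNLI classifier. The model name and the sentence-level granularity on both premise and hypothesis sides are assumptions made for illustration; the paper's point is precisely that such granularity choices matter.

```python
# Illustrative baseline: score each summary sentence (hypothesis) against
# each document sentence (premise) with an off-the-shelf NLI model and keep
# the best entailment probability. Model name is an assumption; any
# MNLI-style classifier would do.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-large-mnli"  # assumed off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
ENTAILMENT = model.config.label2id.get("ENTAILMENT", 2)

def entailment_prob(premise, hypothesis):
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    return probs[ENTAILMENT].item()

def faithfulness(document_sents, summary_sents):
    # Sentence-level granularity on both sides: one possible configuration.
    scores = [max(entailment_prob(p, h) for p in document_sents)
              for h in summary_sents]
    return sum(scores) / len(scores)
```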
no code implementations • 8 Apr 2024 • Giwon Hong, Aryo Pradipta Gema, Rohit Saxena, Xiaotang Du, Ping Nie, Yu Zhao, Laura Perez-Beltrachini, Max Ryabinin, Xuanli He, Clémentine Fourrier, Pasquale Minervini
Large Language Models (LLMs) have transformed the Natural Language Processing (NLP) landscape with their remarkable ability to understand and generate human-like text.