no code implementations • HumEval (ACL) 2022 • Simone Balloccu, Ehud Reiter
However, previous work on app evaluation has focused only on dietary outcomes, ignoring users’ emotional state despite its influence on eating habits.
no code implementations • INLG (ACL) 2021 • Anya Belz, Anastasia Shimorina, Shubham Agarwal, Ehud Reiter
The NLP field has recently seen a substantial increase in work related to reproducibility of results, and more generally in recognition of the importance of having shared definitions and practices relating to evaluation.
no code implementations • INLG (ACL) 2021 • Sameen Maruf, Ingrid Zukerman, Ehud Reiter, Gholamreza Haffari
We offer an approach to explain Decision Tree (DT) predictions by addressing potential conflicts between aspects of these predictions and plausible expectations licensed by background information.
no code implementations • ACL (NL4XAI, INLG) 2020 • Conor Hennessy, Alberto Bugarín, Ehud Reiter
In order to increase trust in the usage of Bayesian Networks and to cement their role as a model which can aid in critical decision making, the challenge of explainability must be faced.
no code implementations • INLG (ACL) 2020 • Anya Belz, Shubham Agarwal, Anastasia Shimorina, Ehud Reiter
Across NLP, a growing body of work is looking at the issue of reproducibility.
no code implementations • INLG (ACL) 2020 • Ehud Reiter, Craig Thomson
We propose a shared task on methodologies and algorithms for evaluating the accuracy of generated texts, specifically summaries of basketball games produced from basketball box score and other game data.
1 code implementation • INLG (ACL) 2020 • Wael Abed, Ehud Reiter
In this paper, we explain the challenges of the core grammar, provide a lexical resource, and implement the first language functions for the Arabic language.
no code implementations • 23 Oct 2024 • Jaime Sevilla, Nikolay Babakov, Ehud Reiter, Alberto Bugarin
In this paper, we propose a model for building natural language explanations for Bayesian Network Reasoning in terms of factor arguments, which are argumentation graphs of flowing evidence, relating the observed evidence to a target variable we want to learn about.
no code implementations • 12 Jul 2024 • Nikolay Babakov, Ehud Reiter, Alberto Bugarin
We also propose an approach to check for contamination of BNs in LLMs, which shows that some widely known BNs are unsuitable for testing the use of LLMs for BN structure elicitation.
no code implementations • 23 Jun 2024 • Mengxuan Sun, Ehud Reiter, Anne E Kiltie, George Ramsay, Lisa Duncan, Peter Murchie, Rosalind Adam
Electronic health records contain detailed information about the medical condition of patients, but they are difficult for patients to understand even if they have access to them.
no code implementations • 28 May 2024 • Silvia García Méndez, Milagros Fernández Gavilanes, Enrique Costa Montenegro, Jonathan Juncal Martínez, Francisco Javier González Castaño, Ehud Reiter
We present an automatic text expansion system to generate English sentences, which performs automatic Natural Language Generation (NLG) by combining linguistic rules with statistical approaches.
no code implementations • 4 May 2024 • Vadim Liventsev, Vivek Kumar, Allmin Pradhap Singh Susaiyah, Zixiu Wu, Ivan Rodin, Asfand Yaar, Simone Balloccu, Marharyta Beraziuk, Sebastiano Battiato, Giovanni Maria Farinella, Aki Härmä, Rim Helaoui, Milan Petkovic, Diego Reforgiato Recupero, Ehud Reiter, Daniele Riboni, Raymond Sterling
The use of machine learning in Healthcare has the potential to improve patient outcomes as well as broaden the reach and affordability of Healthcare.
1 code implementation • 5 Apr 2024 • Barkavi Sundararajan, Somayajulu Sripada, Ehud Reiter
Neural Table-to-Text models tend to hallucinate, producing texts that contain factual errors.
no code implementations • 31 Jan 2024 • Adarsa Sivaprasad, Ehud Reiter
This paper addresses the unique challenges associated with uncertainty quantification in AI models when applied to patient-facing contexts within healthcare.
no code implementations • 17 Jan 2024 • Kittipitch Kuptavanich, Ehud Reiter, Kees Van Deemter, Advaith Siddharthan
We are developing techniques to generate summary descriptions of sets of objects.
no code implementations • 16 Jan 2024 • Simone Balloccu, Ehud Reiter, Vivek Kumar, Diego Reforgiato Recupero, Daniele Riboni
We release HAI-coaching, the first expert-annotated nutrition counselling dataset containing ~2.4K dietary struggles from crowd workers, and ~97K related supportive texts generated by ChatGPT.
no code implementations • 18 Sep 2023 • Adarsa Sivaprasad, Ehud Reiter, Nava Tintarev, Nir Oren
A task-based evaluation of these participants' mental models provides valuable feedback for enhancing narrative global explanations.
no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.
no code implementations • 17 Nov 2022 • Aleksandar Savkov, Francesco Moramarco, Alex Papadopoulos Korfiatis, Mark Perera, Anya Belz, Ehud Reiter
Evaluating automatically generated text is generally hard due to the inherently subjective nature of many aspects of the output quality.
7 code implementations • 9 Nov 2022 • BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.
no code implementations • 23 Jun 2022 • Simone Balloccu, Ehud Reiter
Visual representations of data, such as charts and tables, can be challenging for readers to understand.
no code implementations • NAACL 2022 • Tom Knoll, Francesco Moramarco, Alex Papadopoulos Korfiatis, Rachel Young, Claudia Ruffini, Mark Perera, Christian Perstl, Ehud Reiter, Anya Belz, Aleksandar Savkov
A growing body of work uses Natural Language Processing (NLP) methods to automatically generate medical notes from audio recordings of doctor-patient consultations.
1 code implementation • IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022 • Zixiu Wu, Simone Balloccu, Vivek Kumar, Rim Helaoui, Ehud Reiter, Diego Reforgiato Recupero, Daniele Riboni
Research on natural language processing for counselling dialogue analysis has seen substantial development in recent years, but access to this area remains extremely limited due to the lack of publicly available expert-annotated therapy conversations.
no code implementations • ACL 2022 • Francesco Moramarco, Alex Papadopoulos Korfiatis, Mark Perera, Damir Juric, Jack Flann, Ehud Reiter, Anya Belz, Aleksandar Savkov
In recent years, machine learning models have rapidly become better at generating clinical consultation notes; yet, there is little work on how to properly evaluate the generated consultation notes to understand the impact they may have on both the clinician using them and the patient's clinical safety.
1 code implementation • INLG (ACL) 2021 • Craig Thomson, Ehud Reiter
The Shared Task on Evaluating Accuracy focused on techniques (both manual and automatic) for evaluating the factual accuracy of texts produced by neural NLG systems, in a sports-reporting domain.
no code implementations • EACL (HumEval) 2021 • Francesco Moramarco, Alex Papadopoulos Korfiatis, Aleksandar Savkov, Ehud Reiter
We time this and find that it is faster than writing the note from scratch.
no code implementations • EACL (HumEval) 2021 • Francesco Moramarco, Damir Juric, Aleksandar Savkov, Ehud Reiter
We propose a method for evaluating the quality of generated text by asking evaluators to count facts, and computing precision, recall, f-score, and accuracy from the raw counts.
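As a rough illustration of how count-based scores of this kind can be derived, the sketch below computes precision, recall, and F-score from raw fact counts. The count categories (`correct`, `incorrect`, `omitted`) and all function names are assumptions made for this example, not the paper's own definitions; the snippet above does not say how accuracy is defined, so it is left out.

```python
from dataclasses import dataclass


@dataclass
class FactCounts:
    """Raw counts for one generated text (hypothetical schema, not the paper's)."""
    correct: int    # facts in the generated text judged correct
    incorrect: int  # facts in the generated text judged incorrect
    omitted: int    # expected facts missing from the generated text


def precision(c: FactCounts) -> float:
    """Share of generated facts that are correct."""
    total = c.correct + c.incorrect
    return c.correct / total if total else 0.0


def recall(c: FactCounts) -> float:
    """Share of expected facts that the text actually contains."""
    total = c.correct + c.omitted
    return c.correct / total if total else 0.0


def f_score(c: FactCounts, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    p, r = precision(c), recall(c)
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)


if __name__ == "__main__":
    counts = FactCounts(correct=3, incorrect=1, omitted=3)
    print(precision(counts), recall(counts), f_score(counts))
```

Working from raw counts rather than per-text ratings makes it straightforward to pool counts across evaluators or texts before computing the ratios.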
1 code implementation • EACL 2021 • Anya Belz, Shubham Agarwal, Anastasia Shimorina, Ehud Reiter
Against the background of what has been termed a reproducibility crisis in science, the NLP field is becoming increasingly interested in, and conscientious about, the reproducibility of its results.
1 code implementation • INLG (ACL) 2020 • Craig Thomson, Ehud Reiter
Most Natural Language Generation systems need to produce accurate texts.
no code implementations • IntelLanG 2020 • Craig Thomson, Ehud Reiter, Somayajulu Sripada
In this resource paper, we introduce the SportSett:Basketball database.
no code implementations • IntelLanG 2020 • Simone Balloccu, Ehud Reiter, Alexandra Johnstone, Claire Fyfe
Can stress affect not only your life but also how you read and interpret a text?
no code implementations • 22 Jun 2020 • Ehud Reiter, Craig Thomson
We propose a shared task on methodologies and algorithms for evaluating the accuracy of generated texts.
no code implementations • 29 Apr 2020 • Nirmalie Wiratunga, Kay Cooper, Anjana Wijekoon, Chamath Palihawadana, Vanessa Mendham, Ehud Reiter, Kyle Martin
For this reason, chatbots have recently been seen in healthcare delivering digital interventions through free text or choice selection.
no code implementations • WS 2019 • Ehud Reiter
Good quality explanations of artificial intelligence (XAI) reasoning must be written (and evaluated) for an explanatory purpose, targeted towards their readers, have a good narrative and causal structure, and highlight where uncertainty and data quality affect the AI output.
no code implementations • WS 2018 • Craig Thomson, Ehud Reiter, Somayajulu Sripada
This paper proposes an approach to NLG system design which focuses on generating output text which can be more easily processed by the reader.
no code implementations • WS 2018 • Kittipitch Kuptavanich, Ehud Reiter, Kees Van Deemter, Advaith Siddharthan
We explored the task of creating a textual summary describing a large set of objects characterised by a small number of features using an e-commerce dataset.
no code implementations • WS 2018 • Alejandro Ramos-Soto, Ehud Reiter, Kees Van Deemter, Jose M. Alonso, Albert Gatt
We present a data resource which can be useful for research purposes on language grounding tasks in the context of geographical referring expression generation.
no code implementations • CL 2018 • Ehud Reiter
The BLEU metric has been widely used in NLP for over 15 years to evaluate NLP systems, especially in machine translation and natural language generation.
no code implementations • 10 Aug 2018 • Steffen Pauws, Albert Gatt, Emiel Krahmer, Ehud Reiter
Financial reports are produced to assess healthcare organizations on some key performance indicators to steer their healthcare delivery.
no code implementations • WS 2017 • Ehud Reiter
I briefly describe some of the commercial work which XXX is doing in referring expression algorithms, and highlight differences between what is commercially important (at least to XXX) and the NLG research literature.
no code implementations • WS 2017 • Stephanie Inglis, Ehud Reiter, Somayajulu Sripada
Many data-to-text NLG systems work with data sets which are incomplete, i.e., some of the data is missing.
no code implementations • 30 Mar 2017 • Alejandro Ramos-Soto, Jose M. Alonso, Ehud Reiter, Kees Van Deemter, Albert Gatt
We present a novel heuristic approach that defines fuzzy geographical descriptors using data gathered from a survey with human subjects.
no code implementations • 23 Apr 1995 • Ehud Reiter
One of the most important questions in applied NLG is what benefits (or "value-added", in business-speak) NLG technology offers over template-based approaches.