Search Results for author: Verena Rieser

Found 75 papers, 25 papers with code

Conversational Assistants and Gender Stereotypes: Public Perceptions and Desiderata for Voice Personas

no code implementations GeBNLP (COLING) 2020 Amanda Cercas Curry, Judy Robertson, Verena Rieser

We then outline a multi-disciplinary project of how we plan to address the complex question of gender and stereotyping in digital assistants.

Multiple-choice

Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions

no code implementations INLG (ACL) 2020 David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser

Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches and a proliferation of different quality criteria used by researchers make it difficult to compare results and draw conclusions across papers, with adverse implications for meta-evaluation and reproducibility.

Experimental Design

Guiding the Release of Safer E2E Conversational AI through Value Sensitive Design

no code implementations SIGDIAL (ACL) 2022 A. Stevie Bergman, Gavin Abercrombie, Shannon Spruit, Dirk Hovy, Emily Dinan, Y-Lan Boureau, Verena Rieser

Over the last several years, end-to-end neural conversational agents have vastly improved their ability to carry unrestricted, open-domain conversations with humans.

ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Detection in Conversational AI

no code implementations EMNLP 2021 Amanda Cercas Curry, Gavin Abercrombie, Verena Rieser

We find that the distribution of abuse is vastly different compared to other commonly used datasets, with more sexually tinted aggression towards the virtual persona of these systems.

Abusive Language Chatbot

NLP Verification: Towards a General Methodology for Certifying Robustness

no code implementations15 Mar 2024 Marco Casadio, Tanvi Dinkar, Ekaterina Komendantskaya, Luca Arnaboldi, Omri Isac, Matthew L. Daggitt, Guy Katz, Verena Rieser, Oliver Lemon

We propose a number of practical NLP methods that can help to identify the effects of the embedding gap; and in particular we propose the metric of falsifiability of semantic subpspaces as another fundamental metric to be reported as part of the NLP verification pipeline.

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

no code implementations7 Nov 2023 Georgios Pantazopoulos, Malvina Nikandrou, Amit Parekh, Bhathiya Hemanthage, Arash Eshghi, Ioannis Konstas, Verena Rieser, Oliver Lemon, Alessandro Suglia

Interactive and embodied tasks pose at least two fundamental challenges to existing Vision & Language (VL) models, including 1) grounding language in trajectories of actions and observations, and 2) referential disambiguation.

Text Generation

FurChat: An Embodied Conversational Agent using LLMs, Combining Open and Closed-Domain Dialogue with Facial Expressions

no code implementations29 Aug 2023 Neeraj Cherakara, Finny Varghese, Sheena Shabana, Nivan Nelson, Abhiram Karukayil, Rohith Kulothungan, Mohammed Afil Farhan, Birthe Nesset, Meriam Moujahid, Tanvi Dinkar, Verena Rieser, Oliver Lemon

We demonstrate an embodied conversational agent that can function as a receptionist and generate a mixture of open and closed-domain dialogue along with facial expressions, by using a large language model (LLM) to develop an engaging conversation.

Language Modelling Large Language Model +1

Understanding Counterspeech for Online Harm Mitigation

no code implementations1 Jul 2023 Yi-Ling Chung, Gavin Abercrombie, Florence Enock, Jonathan Bright, Verena Rieser

Counterspeech offers direct rebuttals to hateful speech by challenging perpetrators of hate and showing support to targets of abuse.

Mirages: On Anthropomorphism in Dialogue Systems

no code implementations16 May 2023 Gavin Abercrombie, Amanda Cercas Curry, Tanvi Dinkar, Verena Rieser, Zeerak Talat

In this paper, we discuss the linguistic factors that contribute to the anthropomorphism of dialogue systems and the harms that can arise, including reinforcing gender stereotypes and notions of acceptable language.

iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives?

no code implementations10 May 2023 Nikolas Vitsakis, Amit Parekh, Tanvi Dinkar, Gavin Abercrombie, Ioannis Konstas, Verena Rieser

There are two competing approaches for modelling annotator disagreement: distributional soft-labelling approaches (which aim to capture the level of disagreement) or modelling perspectives of individual annotators or groups thereof.

ANTONIO: Towards a Systematic Method of Generating NLP Benchmarks for Verification

no code implementations6 May 2023 Marco Casadio, Luca Arnaboldi, Matthew L. Daggitt, Omri Isac, Tanvi Dinkar, Daniel Kienitz, Verena Rieser, Ekaterina Komendantskaya

In particular, many known neural network verification methods that work for computer vision and other numeric datasets do not work for NLP.

SemEval-2023 Task 11: Learning With Disagreements (LeWiDi)

no code implementations28 Apr 2023 Elisa Leonardelli, Alexandra Uma, Gavin Abercrombie, Dina Almanea, Valerio Basile, Tommaso Fornaciari, Barbara Plank, Verena Rieser, Massimo Poesio

We report on the second LeWiDi shared task, which differs from the first edition in three crucial respects: (i) it focuses entirely on NLP, instead of both NLP and computer vision tasks in its first edition; (ii) it focuses on subjective tasks, instead of covering different types of disagreements-as training with aggregated labels for subjective NLP tasks is a particularly obvious misrepresentation of the data; and (iii) for the evaluation, we concentrate on soft approaches to evaluation.

Sentiment Analysis

Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment

no code implementations28 Apr 2023 Lu Yu, Malvina Nikandrou, Jiali Jin, Verena Rieser

In this paper, we propose a quality-agnostic framework to improve the performance and robustness of image captioning models for visually impaired people.

Data Augmentation Image Captioning

Consistency is Key: Disentangling Label Variation in Natural Language Processing with Intra-Annotator Agreement

no code implementations25 Jan 2023 Gavin Abercrombie, Verena Rieser, Dirk Hovy

We commonly use agreement measures to assess the utility of judgements made by human annotators in Natural Language Processing (NLP) tasks.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

5 code implementations9 Nov 2022 BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.

Language Modelling Multilingual NLP

Going for GOAL: A Resource for Grounded Football Commentaries

1 code implementation8 Nov 2022 Alessandro Suglia, José Lopes, Emanuele Bastianelli, Andrea Vanzo, Shubham Agarwal, Malvina Nikandrou, Lu Yu, Ioannis Konstas, Verena Rieser

As the course of a game is unpredictable, so are commentaries, which makes them a unique resource to investigate dynamic language grounding.

Moment Retrieval Retrieval

Risk-graded Safety for Handling Medical Queries in Conversational AI

1 code implementation2 Oct 2022 Gavin Abercrombie, Verena Rieser

While individual crowdworkers may be unreliable at grading the seriousness of the prompts, their aggregated labels tend to agree with professional opinion to a greater extent on identifying the medical queries and recognising the risk types posed by the responses.

Adversarial Robustness of Visual Dialog

no code implementations6 Jul 2022 Lu Yu, Verena Rieser

This study is the first to investigate the robustness of visually grounded dialog models towards textual attacks.

Adversarial Robustness Visual Dialog

Why Robust Natural Language Understanding is a Challenge

no code implementations21 Jun 2022 Marco Casadio, Ekaterina Komendantskaya, Verena Rieser, Matthew L. Daggitt, Daniel Kienitz, Luca Arnaboldi, Wen Kokke

With the proliferation of Deep Machine Learning into real-life applications, a particular property of this technology has been brought to attention: robustness Neural Networks notoriously present low robustness and can be highly sensitive to small input perturbations.

Natural Language Understanding

MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization

1 code implementation Findings (EMNLP) 2021 Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas

We show via data analysis that it's not only the models which are to blame: more than 27% of facts mentioned in the gold summaries of MiRANews are better grounded on assisting documents than in the main source articles.

Document Summarization Multi-Document Summarization +2

ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI

1 code implementation20 Sep 2021 Amanda Cercas Curry, Gavin Abercrombie, Verena Rieser

We find that the distribution of abuse is vastly different compared to other commonly used datasets, with more sexually tinted aggression towards the virtual persona of these systems.

Abuse Detection Abusive Language +1

Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

no code implementations7 Jul 2021 Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser

Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans.

AGGGEN: Ordering and Aggregating while Generating

1 code implementation ACL 2021 Xinnuo Xu, Ondřej Dušek, Verena Rieser, Ioannis Konstas

We present AGGGEN (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence planning stages into neural data-to-text systems: input ordering and input aggregation.

Sentence

SLURP: A Spoken Language Understanding Resource Package

1 code implementation EMNLP 2020 Emanuele Bastianelli, Andrea Vanzo, Pawel Swietojanski, Verena Rieser

Spoken Language Understanding infers semantic meaning directly from audio data, and thus promises to reduce error propagation and misunderstandings in end-user applications.

Ranked #3 on Slot Filling on SLURP (using extra training data)

Intent Classification Slot Filling +1

Fact-based Content Weighting for Evaluating Abstractive Summarisation

no code implementations ACL 2020 Xinnuo Xu, Ond{\v{r}}ej Du{\v{s}}ek, Jingyi Li, Verena Rieser, Ioannis Konstas

Abstractive summarisation is notoriously hard to evaluate since standard word-overlap-based metrics are insufficient.

History for Visual Dialog: Do we really need it?

2 code implementations ACL 2020 Shubham Agarwal, Trung Bui, Joon-Young Lee, Ioannis Konstas, Verena Rieser

Visual Dialog involves "understanding" the dialog history (what has been discussed previously) and the current question (what is asked), in addition to grounding information in the image, to generate the correct response.

Visual Dialog

Semantic Noise Matters for Neural Natural Language Generation

1 code implementation WS 2019 Ondřej Dušek, David M. Howcroft, Verena Rieser

Neural natural language generation (NNLG) systems are known for their pathological outputs, i. e. generating text which is unrelated to the input specification.

Data-to-Text Generation Hallucination

Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking)

1 code implementation WS 2019 Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, Verena Rieser

We present a recurrent neural network based system for automatic quality estimation of natural language generation (NLG) outputs, which jointly learns to assign numerical ratings to individual outputs and to provide pairwise rankings of two different outputs.

Learning-To-Rank Text Generation

User Evaluation of a Multi-dimensional Statistical Dialogue System

1 code implementation WS 2019 Simon Keizer, Ondřej Dušek, Xingkun Liu, Verena Rieser

We present the first complete spoken dialogue system driven by a multi-dimensional statistical dialogue manager.

Benchmarking Natural Language Understanding Services for building Conversational Agents

8 code implementations13 Mar 2019 Xingkun Liu, Arash Eshghi, Pawel Swietojanski, Verena Rieser

We have recently seen the emergence of several publicly available Natural Language Understanding (NLU) toolkits, which map user utterances to structured, but more abstract, Dialogue Act (DA) or Intent specifications, while making this process accessible to the lay developer.

Benchmarking General Classification +3

Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge

no code implementations23 Jan 2019 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

Introducing novel automatic and human metrics, we compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.

Text Generation

A Knowledge-Grounded Multimodal Search-Based Conversational Agent

1 code implementation WS 2018 Shubham Agarwal, Ondrej Dusek, Ioannis Konstas, Verena Rieser

Multimodal search-based dialogue is a challenging new task: It extends visually grounded question answering systems into multi-turn conversations with access to an external database.

Question Answering Response Generation

Findings of the E2E NLG Challenge

1 code implementation WS 2018 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems.

Data-to-Text Generation Spoken Dialogue Systems

Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity

1 code implementation EMNLP 2018 Xinnuo Xu, Ond{\v{r}}ej Du{\v{s}}ek, Ioannis Konstas, Verena Rieser

We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity.

Dialogue Generation

Better Conversations by Modeling,Filtering,and Optimizing for Coherence and Diversity

2 code implementations18 Sep 2018 Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, Verena Rieser

We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity.

\#MeToo Alexa: How Conversational Systems Respond to Sexual Harassment

no code implementations WS 2018 Am Cercas Curry, a, Verena Rieser

In this article, we establish how current state-of-the-art conversational systems react to inappropriate requests, such as bullying and sexual harassment on the part of the user, by collecting and analysing the novel {\#}MeTooAlexa corpus.

A Review of Evaluation Techniques for Social Dialogue Systems

no code implementations13 Sep 2017 Amanda Cercas Curry, Helen Hastie, Verena Rieser

In contrast with goal-oriented dialogue, social dialogue has no clear measure of task success.

valid

Referenceless Quality Estimation for Natural Language Generation

1 code implementation5 Aug 2017 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output.

Text Generation

Data-driven Natural Language Generation: Paving the Road to Success

no code implementations28 Jun 2017 Jekaterina Novikova, Ondřej Dušek, Verena Rieser

We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) The lack of reliable automatic evaluation metrics for NLG, and (b) The scarcity of high quality in-domain corpora.

BIG-bench Machine Learning Text Generation

The E2E Dataset: New Challenges For End-to-End Generation

2 code implementations WS 2017 Jekaterina Novikova, Ondřej Dušek, Verena Rieser

This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area.

Data-to-Text Generation

Crowd-sourcing NLG Data: Pictures Elicit Better Data

no code implementations1 Aug 2016 Jekaterina Novikova, Oliver Lemon, Verena Rieser

Recent advances in corpus-based Natural Language Generation (NLG) hold the promise of being easily portable across domains, but require costly training data, consisting of meaning representations (MRs) paired with Natural Language (NL) utterances.

Text Generation

Natural Language Generation as Planning under Uncertainty Using Reinforcement Learning

no code implementations15 Jun 2016 Verena Rieser, Oliver Lemon

We present and evaluate a new model for Natural Language Generation (NLG) in Spoken Dialogue Systems, based on statistical planning, given noisy feedback from the current generation context (e. g. a user and a surface realiser).

reinforcement-learning Reinforcement Learning (RL) +2

Cannot find the paper you are looking for? You can Submit a new open access paper.