no code implementations • ACL (WebNLG, INLG) 2020 • Zdeněk Kasner, Ondřej Dušek
We describe our system for the RDF-to-text generation task of the WebNLG Challenge 2020.
no code implementations • NAACL (WNU) 2022 • Rudolf Rosa, Patrícia Schmidtová, Ondřej Dušek, Tomáš Musil, David Mareček, Saad Obaid, Marie Nováková, Klára Vosecká, Josef Doležal
We experiment with adapting generative language models for the generation of long coherent narratives in the form of theatre plays.
1 code implementation • INLG (ACL) 2021 • Zdeněk Kasner, Simon Mille, Ondřej Dušek
Our system can detect the errors automatically using a combination of a rule-based natural language generation (NLG) system and pretrained language models (LMs).
1 code implementation • LREC 2022 • Vojtěch Hudeček, Leon-paul Schaub, Daniel Stancl, Patrick Paroubek, Ondřej Dušek
In this paper, we present a new dataset, obtained by merging four publicly available annotated corpora for task-oriented dialogues in several domains (MultiWOZ 2.2, CamRest676, DSTC2 and the Schema-Guided Dialogue Dataset).
no code implementations • 11 Apr 2025 • Zdeněk Kasner, Vilém Zouhar, Patrícia Schmidtová, Ivan Kartáč, Kristýna Onderková, Ondřej Plátek, Dimitra Gkatzia, Saad Mahamood, Ondřej Dušek, Simone Balloccu
Until recently, span annotation was limited to human annotators or fine-tuned encoder models.
1 code implementation • 14 Mar 2025 • Ivan Kartáč, Mateusz Lango, Ondřej Dušek
Large Language Models (LLMs) have demonstrated great potential as evaluators of NLG systems, allowing for high-quality, reference-free, and multi-aspect assessments.
1 code implementation • 17 Aug 2024 • Patrícia Schmidtová, Saad Mahamood, Simone Balloccu, Ondřej Dušek, Albert Gatt, Dimitra Gkatzia, David M. Howcroft, Ondřej Plátek, Adarsa Sivaprasad
Automatic metrics are extensively used to evaluate natural language processing systems.
no code implementations • 29 Jul 2024 • Jindřich Helcl, Zdeněk Kasner, Ondřej Dušek, Tomasz Limisiewicz, Dominik Macháček, Tomáš Musil, Jindřich Libovický
This paper presents teaching materials, particularly assignments and ideas for classroom activities, from a new course on large language models (LLMs) taught at Charles University.
2 code implementations • 25 Jul 2024 • Zdeněk Kasner, Ondřej Plátek, Patrícia Schmidtová, Simone Balloccu, Ondřej Dušek
We present factgenie: a framework for annotating and visualizing word spans in textual model outputs.
1 code implementation • 9 Jun 2024 • Sourabrata Mukherjee, Atul Kr. Ojha, Ondřej Dušek
We analyze the performance of large language models (LLMs) on Text Style Transfer (TST), specifically focusing on sentiment transfer and text detoxification across three languages: English, Hindi, and Bengali.
2 code implementations • 31 May 2024 • Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondřej Dušek
Text style transfer (TST) involves altering the linguistic style of a text while preserving its core content.
1 code implementation • 12 Feb 2024 • Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek
This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved.
no code implementations • 6 Feb 2024 • Simone Balloccu, Patrícia Schmidtová, Mateusz Lango, Ondřej Dušek
Natural Language Processing (NLP) research is increasingly focusing on the use of Large Language Models (LLMs), with some of the most popular ones being either fully or partially closed-source.
no code implementations • 18 Jan 2024 • Zdeněk Kasner, Ondřej Dušek
We analyze the behaviors of open large language models (LLMs) on the task of data-to-text (D2T) generation, i.e., generating coherent and relevant text from structured data.
1 code implementation • 22 Dec 2023 • Sourabrata Mukherjee, Zdeněk Kasner, Ondřej Dušek
Text sentiment transfer aims to flip the sentiment polarity of a sentence (positive to negative or vice versa) while preserving its sentiment-independent content.
1 code implementation • 15 Nov 2023 • Nalin Kumar, Ondřej Dušek
Linguistic entrainment, or alignment, represents a phenomenon where linguistic patterns employed by conversational participants converge to one another.
1 code implementation • 25 Oct 2023 • Mateusz Lango, Ondřej Dušek
Our method does not need any changes to the underlying LM's architecture or training procedure and can thus be combined with any model and decoding operating on word probabilities.
2 code implementations • 12 Aug 2023 • Ondřej Plátek, Vojtěch Hudeček, Patricia Schmidtová, Mateusz Lango, Ondřej Dušek
This paper describes the systems submitted by team6 for ChatEval, the DSTC 11 Track 4 competition.
1 code implementation • 12 Aug 2023 • Ondřej Plátek, Mateusz Lango, Ondřej Dušek
This work presents our efforts to reproduce the results of the human evaluation experiment presented in the paper of Vamvas and Sennrich (2022), which evaluated an automatic system detecting over- and undertranslations (translations containing more or less information than the original) in machine translation (MT) outputs.
1 code implementation • 1 Aug 2023 • Saad Obaid ul Islam, Iza Škrjanec, Ondřej Dušek, Vera Demberg
Hallucinations in text generation occur when the system produces text that is not grounded in the input.
no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.
no code implementations • 13 Apr 2023 • Vojtěch Hudeček, Ondřej Dušek
Instruction-tuned Large Language Models (LLMs) have recently gained huge popularity thanks to their ability to interact with users through conversation.
1 code implementation • 27 Feb 2023 • Zdeněk Kasner, Ekaterina Garanina, Ondřej Plátek, Ondřej Dušek
We present TabGenie - a toolkit which enables researchers to explore, preprocess, and analyze a variety of data-to-text generation datasets through the unified framework of table-to-text generation.
1 code implementation • 17 Jan 2023 • Ondřej Plátek, Ondřej Dušek
We present MooseNet, a trainable speech metric that predicts the listeners' Mean Opinion Score (MOS).
1 code implementation • 13 Oct 2022 • Zdeněk Kasner, Ioannis Konstas, Ondřej Dušek
Pretrained language models (PLMs) for data-to-text (D2T) generation can use human-readable data labels such as column headings, keys, or relation names to generalize to out-of-domain examples.
1 code implementation • 22 Sep 2022 • Vojtěch Hudeček, Ondřej Dušek
We present a novel architecture for explainable modeling of task-oriented dialogues with discrete latent variables to represent dialogue actions.
1 code implementation • SIGDIAL (ACL) 2022 • Tomáš Nekvinda, Ondřej Dušek
We introduce AARGH, an end-to-end task-oriented dialog system combining retrieval and generative approaches in a single model, aiming at improving dialog management and lexical diversity of outputs.
no code implementations • 16 Jun 2022 • Patrícia Schmidtová, Dávid Javorský, Christián Mikláš, Tomáš Musil, Rudolf Rosa, Ondřej Dušek
We present a novel approach to generating scripts by using agents with different personality types.
1 code implementation • ACL 2022 • Zdeněk Kasner, Ondřej Dušek
In data-to-text (D2T) generation, training on in-domain data leads to overfitting to the data representation and repeating training data noise.
1 code implementation • Findings (EMNLP) 2021 • Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas
We show via data analysis that it is not only the models that are to blame: more than 27% of facts mentioned in the gold summaries of MiRANews are better grounded in assisting documents than in the main source articles.
no code implementations • INLG (ACL) 2021 • Emiel van Miltenburg, Miruna-Adriana Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson, Luou Wen
We observe a severe under-reporting of the different kinds of errors that Natural Language Generation systems make.
1 code implementation • ACL (GEM) 2021 • Tomáš Nekvinda, Ondřej Dušek
The MultiWOZ dataset (Budzianowski et al., 2018) is frequently used for benchmarking context-to-response abilities of task-oriented dialogue systems.
1 code implementation • ACL 2021 • Xinnuo Xu, Ondřej Dušek, Verena Rieser, Ioannis Konstas
We present AGGGEN (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence planning stages into neural data-to-text systems: input ordering and input aggregation.
no code implementations • 17 Feb 2021 • Rudolf Rosa, Tomáš Musil, Ondřej Dušek, Dominik Jurko, Patrícia Schmidtová, David Mareček, Ondřej Bojar, Tom Kocmi, Daniel Hrbek, David Košťák, Martina Kinská, Marie Nováková, Josef Doležal, Klára Vosecká, Tomáš Studeník, Petr Žabka
We present the first version of a system for interactive generation of theatre play scripts.
1 code implementation • EMNLP (NLP4ConvAI) 2021 • Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek
Our model substantially outperforms the baseline on the MultiWOZ data and shows performance competitive with the state of the art in both automatic and human evaluation.
Ranked #3 on End-To-End Dialogue Modelling on MULTIWOZ 2.0 (using extra training data)
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum
1 code implementation • INLG (ACL) 2020 • Ondřej Dušek, Zdeněk Kasner
A major challenge in evaluating data-to-text (D2T) generation is measuring the semantic accuracy of the generated text, i.e., checking if the output text contains all and only facts supported by the input data.
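The "all and only" criterion can be illustrated with a minimal sketch, assuming facts on both sides have already been reduced to (subject, predicate, object) triples; the restaurant triples and the extraction step itself are hypothetical placeholders, not the paper's actual method.

```python
# Semantic accuracy as set comparison of fact triples (illustrative only).
# "All" facts covered = no omissions; "only" supported facts = no hallucinations.
input_facts = {
    ("Blue Spice", "food", "Italian"),
    ("Blue Spice", "area", "riverside"),
}
# Triples assumed to be extracted from the generated text by some upstream step.
output_facts = {
    ("Blue Spice", "food", "Italian"),
}

missing = input_facts - output_facts       # omissions: violates "all"
hallucinated = output_facts - input_facts  # unsupported additions: violates "only"
semantically_accurate = not missing and not hallucinated
print(missing, hallucinated, semantically_accurate)
```

Here the output omits the `area` fact, so it fails the "all" half of the criterion even though it hallucinates nothing.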
1 code implementation • INLG (ACL) 2020 • Zdeněk Kasner, Ondřej Dušek
Our approach maximizes the completeness and semantic accuracy of the output text while leveraging the abilities of recent pre-trained models for text editing (LaserTagger) and language modeling (GPT-2) to improve the text fluency.
3 code implementations • 9 Aug 2020 • Jan Vainer, Ondřej Dušek
While recent neural sequence-to-sequence models have greatly improved the quality of speech synthesis, there has not been a system capable of fast training, fast inference and high-quality audio synthesis at the same time.
1 code implementation • 3 Aug 2020 • Tomáš Nekvinda, Ondřej Dušek
We introduce an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous approaches.
no code implementations • 25 Jun 2020 • Rudolf Rosa, Ondřej Dušek, Tom Kocmi, David Mareček, Tomáš Musil, Patrícia Schmidtová, Dominik Jurko, Ondřej Bojar, Daniel Hrbek, David Košťák, Martina Kinská, Josef Doležal, Klára Vosecká
We present THEaiTRE, a starting project aimed at automatic generation of theatre play scripts.
1 code implementation • WS 2019 • Ondřej Dušek, David M. Howcroft, Verena Rieser
Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e., generating text which is unrelated to the input specification.
Ranked #3 on Data-to-Text Generation on Cleaned E2E NLG Challenge
2 code implementations • 11 Oct 2019 • Ondřej Dušek, Filip Jurčíček
We present the first dataset targeted at end-to-end NLG in Czech in the restaurant domain, along with several strong baseline models using the sequence-to-sequence approach.
1 code implementation • WS 2019 • Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, Verena Rieser
We present a recurrent neural network based system for automatic quality estimation of natural language generation (NLG) outputs, which jointly learns to assign numerical ratings to individual outputs and to provide pairwise rankings of two different outputs.
no code implementations • WS 2019 • Simon Keizer, Ondřej Dušek, Xingkun Liu, Verena Rieser
We present the first complete spoken dialogue system driven by a multi-dimensional statistical dialogue manager.
no code implementations • 23 Jan 2019 • Ondřej Dušek, Jekaterina Novikova, Verena Rieser
Introducing novel automatic and human metrics, we compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.
1 code implementation • WS 2018 • Igor Shalyminov, Ondřej Dušek, Oliver Lemon
Using a dataset of real conversations collected in the 2017 Alexa Prize challenge, we developed a neural ranker for selecting 'good' system responses to user utterances, i.e., responses which are likely to lead to long and engaging conversations.
1 code implementation • WS 2018 • Ondřej Dušek, Jekaterina Novikova, Verena Rieser
This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems.
Ranked #5 on Data-to-Text Generation on E2E NLG Challenge
2 code implementations • 18 Sep 2018 • Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, Verena Rieser
We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) We introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response, (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs, (3) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity.
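Enhancement (1) above defines coherence as an embedding similarity, which can be sketched as cosine similarity between averaged GloVe vectors of the context and the response. The tiny 3-dimensional `glove` dictionary below is a toy stand-in for real pretrained GloVe embeddings, so the exact score is illustrative only.

```python
import math

# Toy stand-in for pretrained GloVe vectors (real vectors are 100-300-d).
glove = {
    "i": [0.1, 0.3, 0.2], "like": [0.4, 0.1, 0.5],
    "cats": [0.9, 0.8, 0.1], "dogs": [0.8, 0.9, 0.2],
    "too": [0.2, 0.2, 0.3],
}

def avg_embedding(tokens):
    """Average the embeddings of all in-vocabulary tokens."""
    vecs = [glove[t] for t in tokens if t in glove]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(3)]

def coherence(context_tokens, response_tokens):
    """Cosine similarity between averaged context and response vectors."""
    a, b = avg_embedding(context_tokens), avg_embedding(response_tokens)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

score = coherence(["i", "like", "cats"], ["dogs", "too"])
print(round(score, 3))
```

A topically related response ("dogs too" after "i like cats") scores close to 1, which is what lets the measure double as a filtering criterion in (2) and a latent variable in (3).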
1 code implementation • NAACL 2018 • Jekaterina Novikova, Ondřej Dušek, Verena Rieser
Human evaluation for natural language generation (NLG) often suffers from inconsistent user ratings.
no code implementations • 20 Dec 2017 • Ioannis Papaioannou, Amanda Cercas Curry, Jose L. Part, Igor Shalyminov, Xinnuo Xu, Yanchao Yu, Ondřej Dušek, Verena Rieser, Oliver Lemon
Open-domain social dialogue is one of the long-standing goals of Artificial Intelligence.
1 code implementation • 5 Aug 2017 • Ondřej Dušek, Jekaterina Novikova, Verena Rieser
Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output.
1 code implementation • EMNLP 2017 • Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, Verena Rieser
The majority of NLG evaluation relies on automatic metrics, such as BLEU.
no code implementations • 28 Jun 2017 • Jekaterina Novikova, Ondřej Dušek, Verena Rieser
We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) The lack of reliable automatic evaluation metrics for NLG, and (b) The scarcity of high quality in-domain corpora.
2 code implementations • WS 2017 • Jekaterina Novikova, Ondřej Dušek, Verena Rieser
This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area.
1 code implementation • 25 Aug 2016 • Ondřej Dušek, Filip Jurčíček
We present a novel natural language generation system for spoken dialogue systems capable of entraining (adapting) to users' way of speaking, providing contextually appropriate responses.
1 code implementation • 17 Jun 2016 • Ondřej Dušek, Filip Jurčíček
We present a natural language generator based on the sequence-to-sequence approach that can be trained to produce natural language strings as well as deep syntax dependency trees from input dialogue acts, and we use it to directly compare two-step generation with separate sentence planning and surface realization stages to a joint, one-step approach.