Search Results for author: Ondřej Dušek

Found 41 papers, 28 papers with code

Text-in-Context: Token-Level Error Detection for Table-to-Text Generation

1 code implementation INLG (ACL) 2021 Zdeněk Kasner, Simon Mille, Ondřej Dušek

Our system can detect the errors automatically using a combination of a rule-based natural language generation (NLG) system and pretrained language models (LMs).

Language Modelling Semantic Similarity +2

A Unifying View On Task-oriented Dialogue Annotation

1 code implementation LREC 2022 Vojtěch Hudeček, Leon-Paul Schaub, Daniel Stancl, Patrick Paroubek, Ondřej Dušek

In this paper, we present a new dataset, obtained by merging four publicly available annotated corpora for task-oriented dialogues in several domains (MultiWOZ 2.2, CamRest676, DSTC2 and Schema-Guided Dialogue Dataset).

Dialogue Generation Dialogue State Tracking +1

Are LLMs All You Need for Task-Oriented Dialogue?

no code implementations 13 Apr 2023 Vojtěch Hudeček, Ondřej Dušek

Instruction-tuned large language models (LLMs) have recently gained huge popularity thanks to their ability to interact with users through conversation.

TabGenie: A Toolkit for Table-to-Text Generation

1 code implementation 27 Feb 2023 Zdeněk Kasner, Ekaterina Garanina, Ondřej Plátek, Ondřej Dušek

We present TabGenie, a toolkit that enables researchers to explore, preprocess, and analyze a variety of data-to-text generation datasets through the unified framework of table-to-text generation.

Data-to-Text Generation Table-to-Text Generation

Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models

1 code implementation 13 Oct 2022 Zdeněk Kasner, Ioannis Konstas, Ondřej Dušek

Pretrained language models (PLMs) for data-to-text (D2T) generation can use human-readable data labels such as column headings, keys, or relation names to generalize to out-of-domain examples.

Knowledge Graphs

Learning Interpretable Latent Dialogue Actions With Less Supervision

1 code implementation 22 Sep 2022 Vojtěch Hudeček, Ondřej Dušek

We present a novel architecture for explainable modeling of task-oriented dialogues with discrete latent variables to represent dialogue actions.

AARGH! End-to-end Retrieval-Generation for Task-Oriented Dialog

1 code implementation SIGDIAL (ACL) 2022 Tomáš Nekvinda, Ondřej Dušek

We introduce AARGH, an end-to-end task-oriented dialog system combining retrieval and generative approaches in a single model, aiming at improving dialog management and lexical diversity of outputs.

Management Response Generation +1

Neural Pipeline for Zero-Shot Data-to-Text Generation

1 code implementation ACL 2022 Zdeněk Kasner, Ondřej Dušek

In data-to-text (D2T) generation, training on in-domain data leads to overfitting to the data representation and repeating training data noise.

Data-to-Text Generation

MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization

1 code implementation Findings (EMNLP) 2021 Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas

We show via data analysis that it's not only the models which are to blame: more than 27% of facts mentioned in the gold summaries of MiRANews are better grounded on assisting documents than in the main source articles.

Document Summarization Multi-Document Summarization +1

Shades of BLEU, Flavours of Success: The Case of MultiWOZ

1 code implementation ACL (GEM) 2021 Tomáš Nekvinda, Ondřej Dušek

The MultiWOZ dataset (Budzianowski et al., 2018) is frequently used for benchmarking context-to-response abilities of task-oriented dialogue systems.

Benchmarking Task-Oriented Dialogue Systems

AGGGEN: Ordering and Aggregating while Generating

1 code implementation ACL 2021 Xinnuo Xu, Ondřej Dušek, Verena Rieser, Ioannis Konstas

We present AGGGEN (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence planning stages into neural data-to-text systems: input ordering and input aggregation.

AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models

1 code implementation EMNLP (NLP4ConvAI) 2021 Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek

Our model substantially outperforms the baseline on the MultiWOZ data and shows competitive performance with state of the art in both automatic and human evaluation.

Ranked #3 on End-To-End Dialogue Modelling on MULTIWOZ 2.0 (using extra training data)

End-To-End Dialogue Modelling Translation

Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference

1 code implementation INLG (ACL) 2020 Ondřej Dušek, Zdeněk Kasner

A major challenge in evaluating data-to-text (D2T) generation is measuring the semantic accuracy of the generated text, i.e., checking if the output text contains all and only facts supported by the input data.

Data-to-Text Generation Natural Language Inference

Data-to-Text Generation with Iterative Text Editing

1 code implementation INLG (ACL) 2020 Zdeněk Kasner, Ondřej Dušek

Our approach maximizes the completeness and semantic accuracy of the output text while leveraging the abilities of recent pre-trained models for text editing (LaserTagger) and language modeling (GPT-2) to improve the text fluency.

Data-to-Text Generation Domain Adaptation +2

SpeedySpeech: Efficient Neural Speech Synthesis

3 code implementations 9 Aug 2020 Jan Vainer, Ondřej Dušek

While recent neural sequence-to-sequence models have greatly improved the quality of speech synthesis, there has not been a system capable of fast training, fast inference and high-quality audio synthesis at the same time.

Speech Synthesis

One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech

1 code implementation 3 Aug 2020 Tomáš Nekvinda, Ondřej Dušek

We introduce an approach to multilingual speech synthesis which uses the meta-learning concept of contextual parameter generation and produces natural-sounding multilingual speech using more languages and less training data than previous approaches.

Meta-Learning Speech Synthesis +1

Semantic Noise Matters for Neural Natural Language Generation

1 code implementation WS 2019 Ondřej Dušek, David M. Howcroft, Verena Rieser

Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e., generating text which is unrelated to the input specification.

Data-to-Text Generation

Neural Generation for Czech: Data and Baselines

2 code implementations 11 Oct 2019 Ondřej Dušek, Filip Jurčíček

We present the first dataset targeted at end-to-end NLG in Czech in the restaurant domain, along with several strong baseline models using the sequence-to-sequence approach.

Language Modelling

Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking)

1 code implementation WS 2019 Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, Verena Rieser

We present a recurrent neural network based system for automatic quality estimation of natural language generation (NLG) outputs, which jointly learns to assign numerical ratings to individual outputs and to provide pairwise rankings of two different outputs.

Learning-To-Rank Text Generation

User Evaluation of a Multi-dimensional Statistical Dialogue System

1 code implementation WS 2019 Simon Keizer, Ondřej Dušek, Xingkun Liu, Verena Rieser

We present the first complete spoken dialogue system driven by a multi-dimensional statistical dialogue manager.

Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge

no code implementations 23 Jan 2019 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

Introducing novel automatic and human metrics, we compare 62 systems submitted by 17 institutions, covering a wide range of approaches, including machine learning architectures -- with the majority implementing sequence-to-sequence models (seq2seq) -- as well as systems based on grammatical rules and templates.

Text Generation

Neural Response Ranking for Social Conversation: A Data-Efficient Approach

1 code implementation WS 2018 Igor Shalyminov, Ondřej Dušek, Oliver Lemon

Using a dataset of real conversations collected in the 2017 Alexa Prize challenge, we developed a neural ranker for selecting 'good' system responses to user utterances, i.e., responses which are likely to lead to long and engaging conversations.

Findings of the E2E NLG Challenge

1 code implementation WS 2018 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems.

Data-to-Text Generation Spoken Dialogue Systems

Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity

2 code implementations 18 Sep 2018 Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, Verena Rieser

We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) we introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response; (2) we filter our training corpora based on the measure of coherence to obtain topically coherent and lexically diverse context-response pairs; (3) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity.

Referenceless Quality Estimation for Natural Language Generation

1 code implementation 5 Aug 2017 Ondřej Dušek, Jekaterina Novikova, Verena Rieser

Traditional automatic evaluation measures for natural language generation (NLG) use costly human-authored references to estimate the quality of a system output.

Text Generation

The E2E Dataset: New Challenges For End-to-End Generation

2 code implementations WS 2017 Jekaterina Novikova, Ondřej Dušek, Verena Rieser

This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area.

Data-to-Text Generation

Data-driven Natural Language Generation: Paving the Road to Success

no code implementations 28 Jun 2017 Jekaterina Novikova, Ondřej Dušek, Verena Rieser

We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) the lack of reliable automatic evaluation metrics for NLG, and (b) the scarcity of high-quality in-domain corpora.

BIG-bench Machine Learning Text Generation

A Context-aware Natural Language Generator for Dialogue Systems

1 code implementation 25 Aug 2016 Ondřej Dušek, Filip Jurčíček

We present a novel natural language generation system for spoken dialogue systems capable of entraining (adapting) to users' way of speaking, providing contextually appropriate responses.

Spoken Dialogue Systems Text Generation

Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings

1 code implementation 17 Jun 2016 Ondřej Dušek, Filip Jurčíček

We present a natural language generator based on the sequence-to-sequence approach that can be trained to produce natural language strings as well as deep syntax dependency trees from input dialogue acts, and we use it to directly compare two-step generation with separate sentence planning and surface realization stages to a joint, one-step approach.
