Search Results for author: Thiago Castro Ferreira

Found 32 papers, 10 papers with code

Enriching the E2E dataset

1 code implementation INLG (ACL) 2021 Thiago Castro Ferreira, Helena Vaz, Brian Davis, Adriana Pagano

This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG.

Referring Expression Referring expression generation

Generating Questions from Wikidata Triples

no code implementations LREC 2022 Kelvin Han, Thiago Castro Ferreira, Claire Gardent

Question generation from knowledge bases (or knowledge base question generation, KBQG) is the task of generating questions from structured database information, typically in the form of triples representing facts.

Knowledge Base Question Answering Question Generation +1

MTLens: Machine Translation Output Debugging

no code implementations LREC 2022 Shreyas Sharma, Kareem Darwish, Lucas Pavanelli, Thiago Castro Ferreira, Mohamed Al-Badrashiny, Kamer Ali Yuksel, Hassan Sawaf

The performance of Machine Translation (MT) systems varies significantly with inputs of diverging features such as topics, genres, and surface properties.

Benchmarking Machine Translation +2

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results

1 code implementation MSR (COLING) 2020 Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner

As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.

A Systematic Review of Data-to-Text NLG

no code implementations 13 Feb 2024 Chinonso Cynthia Osuji, Thiago Castro Ferreira, Brian Davis

Relevant literature in this field on datasets, evaluation metrics, application areas, multilingualism, language models, and hallucination mitigation methods is reviewed.

Data-to-Text Generation Hallucination +1

Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model

no code implementations 14 Jul 2022 Chris van der Lee, Thiago Castro Ferreira, Chris Emmery, Travis Wiltshire, Emiel Krahmer

In terms of output quality, extending the training set of a data-to-text system with a language model using the pseudo-labeling approach did increase text quality scores, but the data augmentation approach yielded similar scores to the system without training set extension.

Data Augmentation Data-to-Text Generation +2

Surface Realization Shared Task 2019 (MSR19): The Team 6 Approach

no code implementations WS 2019 Thiago Castro Ferreira, Emiel Krahmer

This study describes the approach developed by the Tilburg University team to the shallow track of the Multilingual Surface Realization Shared Task 2019 (SR'19) (Mille et al., 2019).

Machine Translation Translation

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures

1 code implementation IJCNLP 2019 Thiago Castro Ferreira, Chris van der Lee, Emiel van Miltenburg, Emiel Krahmer

In contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with much less explicit intermediate representations in-between.

Data-to-Text Generation Decoder

Enriching the WebNLG corpus

1 code implementation WS 2018 Thiago Castro Ferreira, Diego Moussallem, Emiel Krahmer, Sander Wubben

This paper describes the enrichment of the WebNLG corpus (Gardent et al., 2017a, b), with the aim of further extending its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation.

Machine Translation Referring Expression +3

Surface Realization Shared Task 2018 (SR18): The Tilburg University Approach

1 code implementation WS 2018 Thiago Castro Ferreira, Sander Wubben, Emiel Krahmer

This study describes the approach developed by the Tilburg University team to the shallow task of the Multilingual Surface Realization Shared Task 2018 (SR18).

Machine Translation Translation

NeuralREG: An end-to-end approach to referring expression generation

1 code implementation ACL 2018 Thiago Castro Ferreira, Diego Moussallem, Ákos Kádár, Sander Wubben, Emiel Krahmer

Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function.

Referring Expression Referring expression generation

RDF2PT: Generating Brazilian Portuguese Texts from RDF Data

1 code implementation LREC 2018 Diego Moussallem, Thiago Castro Ferreira, Marcos Zampieri, Maria Claudia Cavalcanti, Geraldo Xexéo, Mariana Neves, Axel-Cyrille Ngonga Ngomo

The generation of natural language from Resource Description Framework (RDF) data has recently gained significant attention due to the continuous growth of Linked Data.

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation

no code implementations WS 2017 Thiago Castro Ferreira, Iacer Calixto, Sander Wubben, Emiel Krahmer

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT).

AMR-to-Text Generation Machine Translation +2

Improving the generation of personalised descriptions

no code implementations WS 2017 Thiago Castro Ferreira, Ivandré Paraboni

Referring expression generation (REG) models that use speaker-dependent information require a considerable amount of training data produced by every individual speaker, or may otherwise perform poorly.

Referring Expression Referring expression generation +1

Trainable Referring Expression Generation using Overspecification Preferences

no code implementations 12 Apr 2017 Thiago Castro Ferreira, Ivandre Paraboni

Referring expression generation (REG) models that use speaker-dependent information require a considerable amount of training data produced by every individual speaker, or may otherwise perform poorly.

Referring Expression Referring expression generation

Generating flexible proper name references in text: Data, models and evaluation

no code implementations EACL 2017 Thiago Castro Ferreira, Emiel Krahmer, Sander Wubben

The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts.

Text Generation
