Search Results for author: Thibault Sellam

Found 12 papers, 3 papers with code

Reward Gaming in Conditional Text Generation

no code implementations 16 Nov 2022 Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur P. Parikh, He He

To align conditional text generation model outputs with desired behaviors, there has been an increasing focus on training the model using reinforcement learning (RL) with reward functions learned from human annotations.

Conditional Text Generation, Reinforcement Learning (RL)

Dialect-robust Evaluation of Generated Text

no code implementations 2 Nov 2022 Jiao Sun, Thibault Sellam, Elizabeth Clark, Tu Vu, Timothy Dozat, Dan Garrette, Aditya Siddhant, Jacob Eisenstein, Sebastian Gehrmann

Evaluation metrics that are not robust to dialect variation make it impossible to tell how well systems perform for many groups of users, and can even penalize systems for producing text in lower-resource dialects.

SQuId: Measuring Speech Naturalness in Many Languages

no code implementations 12 Oct 2022 Thibault Sellam, Ankur Bapna, Joshua Camp, Diana Mackinnon, Ankur P. Parikh, Jason Riesa

The main insight is that training one model on many locales consistently outperforms mono-locale baselines.

Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text

no code implementations 14 Feb 2022 Sebastian Gehrmann, Elizabeth Clark, Thibault Sellam

We summarize, categorize, and discuss how researchers have been addressing these issues and what their findings mean for the current state of model evaluations.

Text Generation

Learning Compact Metrics for MT

1 code implementation EMNLP 2021 Amy Pu, Hyung Won Chung, Ankur P. Parikh, Sebastian Gehrmann, Thibault Sellam

Recent developments in machine translation and multilingual text generation have led researchers to adopt trained metrics such as COMET or BLEURT, which treat evaluation as a regression problem and use representations from multilingual pre-trained models such as XLM-RoBERTa or mBERT.

Cross-Lingual Transfer, Language Modelling +4

BLEURT: Learning Robust Metrics for Text Generation

3 code implementations ACL 2020 Thibault Sellam, Dipanjan Das, Ankur P. Parikh

We propose BLEURT, a learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples.

Text Generation

A Multilingual View of Unsupervised Machine Translation

no code implementations Findings of the Association for Computational Linguistics 2020 Xavier Garcia, Pierre Foret, Thibault Sellam, Ankur P. Parikh

We present a probabilistic framework for multilingual neural machine translation that encompasses supervised and unsupervised setups, focusing on unsupervised translation.

Translation, Unsupervised Machine Translation

Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation

no code implementations 19 Oct 2019 Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh

We address the issue of hallucination in data-to-text generation, i.e., reducing the generation of text that is unsupported by the source.

Data-to-Text Generation
