HumanMT is a collection of human ratings and corrections of machine translations. It consists of two parts: the first contains five-point and pairwise sentence-level ratings; the second contains error markings and corrections. Details are given below.

I. Sentence-level ratings

This is a collection of five-point and pairwise ratings for 1000 German-English machine translations of TED talks (IWSLT 2014). The ratings were collected to assess the reliability and learnability of machine translation quality ratings, with the goal of improving a neural machine translation model through human reinforcement (see publications).
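The release format of the ratings is not documented in this description. As a rough sketch only, the two rating types could be represented along the following lines; all class and field names here are hypothetical and chosen purely for illustration:

```python
from dataclasses import dataclass

# Hypothetical record layouts for the two rating types described above.
# The actual file format and field names of the HumanMT release may differ.

@dataclass
class FivePointRating:
    source: str        # German source sentence (TED talk, IWSLT 2014)
    translation: str   # English machine translation being judged
    score: int         # five-point quality judgment, e.g. 1 (worst) to 5 (best)

@dataclass
class PairwiseRating:
    source: str         # German source sentence
    translation_a: str  # first candidate translation
    translation_b: str  # second candidate translation
    preference: str     # rater's preference, e.g. "a", "b", or "tie"
```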

II. Error markings and corrections

This is a collection of word-level error markings and post-edits/corrections for 3120 English-German machine-translated sentences from 30 selected TED talks (IWSLT 2017). Each sentence received either a correction or a marking of errors from human annotators. The data was collected to compare the two annotation modes in terms of cost, quality, and their potential for downstream machine translation improvements (see publications).
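Since each sentence carries exactly one of the two annotation types, a natural (again purely hypothetical) representation is a single record with one optional field per annotation mode; the actual release format may differ:

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record layout for the marking/correction data described above.

@dataclass
class AnnotatedSentence:
    source: str       # English source sentence (TED talk, IWSLT 2017)
    translation: str  # German machine translation
    # Exactly one of the following is expected to be set, depending on the
    # annotation mode the sentence was assigned to:
    error_tokens: Optional[List[int]] = None  # indices of tokens marked as erroneous
    correction: Optional[str] = None          # full post-edited (corrected) translation
```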

Source: HumanMT: Human Machine Translation Ratings
