WMT 2016 IT (WMT 2016 IT Translation Task)

Introduced by Bojar et al. in Findings of the 2016 Conference on Machine Translation

The IT Translation Task is a shared task introduced at the First Conference on Machine Translation (WMT 2016). Compared to the WMT 2016 News task, it brought several novelties to WMT:

  • 4 of the 7 languages of the IT task are new to WMT,
  • adaptation to the IT domain with its specifics such as frequent named entities (mostly menu items, names of products and companies) and technical jargon,
  • adaptation to translating answers in a helpdesk service setting (many of the sentences are instructions with imperative verbs, which are very rare in the News translation task).

The test set consisted of 1000 answers from Batch 3 of the QTLeap Corpus. The in-domain training data contained 2000 answers from Batches 1 and 2, localization files from several open-source projects (LibreOffice, KDE, VLC), and bilingual dictionaries of IT-related terms extracted from Wikipedia. The out-of-domain training data contained all the corpora from the WMT 2016 News task, plus the PaCo2-EuEn Basque-English corpus and the SETimes corpus of Bulgarian-English parallel sentences. “Constrained” systems were restricted to using only the training data provided by the organizers.

The task was evaluated on the following language pairs:

  • English → Bulgarian
  • English → Czech
  • English → German
  • English → Spanish
  • English → Basque
  • English → Dutch
  • English → Portuguese
Source: http://www.statmt.org/wmt16/index.html
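
The setup described above can be summarized in a small configuration sketch. This is only an illustration, not an official loader: the names SOURCE_LANGUAGE, TARGET_LANGUAGES, QTLEAP_ANSWERS and language_pairs are hypothetical, and the two-letter codes are the standard ISO 639-1 codes for the listed languages rather than identifiers taken from the task's data files.

```python
# Illustrative summary of the WMT 2016 IT task setup.
# All names below are hypothetical; they do not come from the task's release.

SOURCE_LANGUAGE = "en"

# ISO 639-1 codes for the seven target languages listed in the description.
TARGET_LANGUAGES = {
    "bg": "Bulgarian",
    "cs": "Czech",
    "de": "German",
    "es": "Spanish",
    "eu": "Basque",
    "nl": "Dutch",
    "pt": "Portuguese",
}

# Number of QTLeap answers per split, as stated in the description.
QTLEAP_ANSWERS = {
    "train_in_domain": 2000,  # Batches 1 and 2
    "test": 1000,             # Batch 3
}


def language_pairs():
    """Return the evaluated translation directions as 'src-tgt' strings."""
    return [f"{SOURCE_LANGUAGE}-{code}" for code in TARGET_LANGUAGES]


if __name__ == "__main__":
    for pair in language_pairs():
        print(pair)
```

Keeping the target languages in a single mapping makes it easy to iterate over all seven directions when preparing data or reporting scores per language pair.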

