Search Results for author: Ondrej Bojar

Found 9 papers, 1 paper with code

Announcing CzEng 2.0 Parallel Corpus with over 2 Gigawords

no code implementations 6 Jul 2020 Tom Kocmi, Martin Popel, Ondrej Bojar

We present a new release of the Czech-English parallel corpus CzEng 2.0 consisting of over 2 billion words (2 "gigawords") in each language.

COSTRA 1.0: A Dataset of Complex Sentence Transformations

no code implementations LREC 2020 Petra Barancikova, Ondrej Bojar

The hope is that with this dataset, we should be able to test semantic properties of sentence embeddings and perhaps even to find some topologically interesting 'skeleton' in the sentence embedding space.

Sentence Embedding Sentence-Embedding

Curriculum Learning and Minibatch Bucketing in Neural Machine Translation

no code implementations RANLP 2017 Tom Kocmi, Ondrej Bojar

We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT).

Machine Translation Translation

HUME: Human UCCA-Based Evaluation of Machine Translation

1 code implementation EMNLP 2016 Alexandra Birch, Omri Abend, Ondrej Bojar, Barry Haddow

Human evaluation of machine translation normally uses sentence-level measures such as relative ranking or adequacy scales.

Machine Translation Translation
