Search Results for author: Ondrej Bojar

Found 11 papers, 1 paper with code

CUNI Submission to MT4All Shared Task

no code implementations SIGUL (LREC) 2022 Ivana Kvapilíková, Ondrej Bojar

This paper describes our submission to the MT4All Shared Task in unsupervised machine translation from English to Ukrainian, Kazakh and Georgian in the legal domain.

Denoising, Translation, +1

Unveiling Multilinguality in Transformer Models: Exploring Language Specificity in Feed-Forward Networks

no code implementations 24 Oct 2023 Sunit Bhattacharya, Ondrej Bojar

The values then combine the output from the 'memories' of the keys to generate predictions about the next token.

Specificity

Announcing CzEng 2.0 Parallel Corpus with over 2 Gigawords

no code implementations 6 Jul 2020 Tom Kocmi, Martin Popel, Ondrej Bojar

We present a new release of the Czech-English parallel corpus CzEng 2.0 consisting of over 2 billion words (2 "gigawords") in each language.

COSTRA 1.0: A Dataset of Complex Sentence Transformations

no code implementations LREC 2020 Petra Barancikova, Ondrej Bojar

The hope is that with this dataset, we should be able to test semantic properties of sentence embeddings and perhaps even to find some topologically interesting 'skeleton' in the sentence embedding space.

Sentence, Sentence Embedding, +1

Curriculum Learning and Minibatch Bucketing in Neural Machine Translation

no code implementations RANLP 2017 Tom Kocmi, Ondrej Bojar

We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT).

Machine Translation, NMT, +2

HUME: Human UCCA-Based Evaluation of Machine Translation

1 code implementation EMNLP 2016 Alexandra Birch, Omri Abend, Ondrej Bojar, Barry Haddow

Human evaluation of machine translation normally uses sentence-level measures such as relative ranking or adequacy scales.

Machine Translation, Sentence, +1
