no code implementations • SIGUL (LREC) 2022 • Ivana Kvapilíková, Ondrej Bojar
This paper describes our submission to the MT4All Shared Task in unsupervised machine translation from English to Ukrainian, Kazakh and Georgian in the legal domain.
no code implementations • 24 Oct 2023 • Sunit Bhattacharya, Ondrej Bojar
The values then combine the output from the 'memories' of the keys to generate predictions about the next token.
no code implementations • CMCL (ACL) 2022 • Sunit Bhattacharya, Rishu Kumar, Ondrej Bojar
Our submissions achieved an average MAE of 5.72 and ranked 5th in the shared task.
no code implementations • 6 Jul 2020 • Tom Kocmi, Martin Popel, Ondrej Bojar
We present a new release of the Czech-English parallel corpus CzEng 2.0, consisting of over 2 billion words (2 "gigawords") in each language.
no code implementations • LREC 2020 • Jonas Kratochvil, Peter Polak, Ondrej Bojar
We present a large corpus of Czech parliament plenary sessions.
no code implementations • LREC 2020 • Petra Barancikova, Ondrej Bojar
The hope is that with this dataset, we should be able to test semantic properties of sentence embeddings and perhaps even to find some topologically interesting 'skeleton' in the sentence embedding space.
no code implementations • RANLP 2017 • Tom Kocmi, Ondrej Bojar
We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT).
1 code implementation • EMNLP 2016 • Alexandra Birch, Omri Abend, Ondrej Bojar, Barry Haddow
Human evaluation of machine translation normally uses sentence-level measures such as relative ranking or adequacy scales.
no code implementations • WS 2014 • Ondrej Bojar, Christian Buck, Christian Federmann, Barry Haddow, Philipp Koehn, Johannes Leveling, Christof Monz, Pavel Pecina, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia, Aleš Tamchyna