no code implementations • WS 2019 • Sheila Castilho, Natália Resende, Federico Gaspari, Andy Way, Tony O'Dowd, Marek Mazur, Manuel Herranz, Alex Helle, Gema Ramírez-Sánchez, Víctor Sánchez-Cartagena, Mārcis Pinnis, Valters Šics
no code implementations • WS 2018 • Mārcis Pinnis
The paper describes parallel corpus filtering methods that reduce the noise in noisy "parallel" corpora from a level where the corpora are not usable for neural machine translation training (i.e., the resulting systems fail to achieve reasonable translation quality; well below 10 BLEU points) to a level where the trained systems show decent quality (over 20 BLEU points on a 10 million word dataset and up to 30 BLEU points on a 100 million word dataset).
1 code implementation • WS 2018 • Mārcis Pinnis, Matīss Rikters, Rihards Krišlauks
For the WMT 2018 shared task, we submitted seven systems (both constrained and unconstrained) for English-Estonian and Estonian-English translation directions.
Ranked #1 on Machine Translation on WMT 2018 Estonian-English
1 code implementation • LREC 2018 • Matīss Rikters, Mārcis Pinnis, Rihards Krišlauks
no code implementations • IJCNLP 2017 • Inguna Skadiņa, Mārcis Pinnis
The recent technological shift in machine translation from statistical machine translation (SMT) to neural machine translation (NMT) raises the question of the strengths and weaknesses of NMT.
no code implementations • WS 2017 • Jan-Thorsten Peter, Hermann Ney, Ondřej Bojar, Ngoc-Quan Pham, Jan Niehues, Alex Waibel, Franck Burlot, François Yvon, Mārcis Pinnis, Valters Šics, Jasmijn Bastings, Miguel Rios, Wilker Aziz, Philip Williams, Frédéric Blain, Lucia Specia
no code implementations • WS 2016 • Jan-Thorsten Peter, Tamer Alkhouli, Hermann Ney, Matthias Huck, Fabienne Braune, Alexander Fraser, Aleš Tamchyna, Ondřej Bojar, Barry Haddow, Rico Sennrich, Frédéric Blain, Lucia Specia, Jan Niehues, Alex Waibel, Alexandre Allauzen, Lauriane Aufrant, Franck Burlot, Elena Knyazeva, Thomas Lavergne, François Yvon, Mārcis Pinnis, Stella Frank
Ranked #12 on Machine Translation on WMT2016 English-Romanian
no code implementations • LREC 2016 • Mārcis Pinnis, Askars Salimbajevs, Ilze Auziņa
In this paper the authors present a speech corpus designed and created for the development and evaluation of dictation systems in Latvian.
Automatic Speech Recognition (ASR)
no code implementations • LREC 2014 • Mārcis Pinnis, Ilze Auziņa, Kārlis Goba
In this paper the authors present the first Latvian speech corpus designed specifically for speech recognition purposes.
no code implementations • LREC 2014 • Juris Borzovs, Ilze Ilziņa, Iveta Keiša, Mārcis Pinnis, Andrejs Vasiļjevs
Analysis of the terms shows that, in general, localized terms in normative terminology work in Latvia are coined according to these guidelines.
1 code implementation • LREC 2014 • Ahmet Aker, Monica Paramita, Mārcis Pinnis, Robert Gaizauskas
In this work we present three different methods for cleaning noise from automatically generated bilingual dictionaries: LLR-, pivot- and translation-based approaches.
no code implementations • LREC 2012 • Inguna Skadiņa, Ahmet Aker, Nikos Mastropavlos, Fangzhong Su, Dan Tufis, Mateja Verlic, Andrejs Vasiļjevs, Bogdan Babych, Paul Clough, Robert Gaizauskas, Nikos Glaros, Monica Lestari Paramita, Mārcis Pinnis
Lack of sufficient parallel data for many languages and domains is currently one of the major obstacles to further advancement of automated translation.
no code implementations • LREC 2012 • Mārcis Pinnis
It also gives an evaluation on human-annotated gold-standard test corpora for Latvian and Lithuanian, as well as a comparative performance analysis against a state-of-the-art English named entity recognition system using parallel and strongly comparable corpora.