Search Results for author: Mārcis Pinnis

Found 19 papers, 5 papers with code

Tilde's Parallel Corpus Filtering Methods for WMT 2018

no code implementations WS 2018 Mārcis Pinnis

The paper describes parallel corpus filtering methods that reduce the noise in noisy "parallel" corpora from a level where the corpora are unusable for neural machine translation training (i.e., the resulting systems fail to achieve reasonable translation quality, scoring well below 10 BLEU points) to a level where the trained systems show decent quality (over 20 BLEU points on a 10 million word dataset and up to 30 BLEU points on a 100 million word dataset).

Translation Transliteration +1

Tilde's Machine Translation Systems for WMT 2018

1 code implementation WS 2018 Mārcis Pinnis, Matīss Rikters, Rihards Krišlauks

For the WMT 2018 shared task, we submitted seven systems (both constrained and unconstrained) for English-Estonian and Estonian-English translation directions.

Machine Translation NMT +1

NMT or SMT: Case Study of a Narrow-domain English-Latvian Post-editing Project

no code implementations IJCNLP 2017 Inguna Skadiņa, Mārcis Pinnis

The recent technological shift in machine translation from statistical machine translation (SMT) to neural machine translation (NMT) raises the question of the strengths and weaknesses of NMT.

Machine Translation NMT +1

Designing the Latvian Speech Recognition Corpus

no code implementations LREC 2014 Mārcis Pinnis, Ilze Auziņa, Kārlis Goba

In this paper the authors present the first Latvian speech corpus designed specifically for speech recognition purposes.

Speech Recognition +1

Terminology localization guidelines for the national scenario

no code implementations LREC 2014 Juris Borzovs, Ilze Ilziņa, Iveta Keiša, Mārcis Pinnis, Andrejs Vasiļjevs

Analysis of the terms proves that, in general, localized terms in the normative terminology work in Latvia are coined according to these guidelines.

Lexical Analysis

Bilingual dictionaries for all EU languages

1 code implementation LREC 2014 Ahmet Aker, Monica Paramita, Mārcis Pinnis, Robert Gaizauskas

In this work we present three different methods for cleaning noise from automatically generated bilingual dictionaries: LLR, pivot, and translation-based approaches.

Translation Transliteration

Latvian and Lithuanian Named Entity Recognition with TildeNER

no code implementations LREC 2012 Mārcis Pinnis

It also gives an evaluation on human-annotated gold-standard test corpora for the Latvian and Lithuanian languages, as well as a comparative performance analysis against a state-of-the-art English named entity recognition system using parallel and strongly comparable corpora.

Machine Translation named-entity-recognition +2
