no code implementations • ACL 2022 • Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, Mark Fishel
Livonian is one of the most endangered languages in Europe with just a tiny handful of speakers and virtually no publicly available corpora.
no code implementations • AACL (WAT) 2020 • Matīss Rikters, Toshiaki Nakazawa, Ryokan Ri
The paper describes the development process of The University of Tokyo’s NMT systems that were submitted to the WAT 2020 Document-level Business Scene Dialogue Translation sub-task.
no code implementations • 11 Apr 2023 • Maija Kāle, Matīss Rikters
Food choice is a complex phenomenon shaped by factors such as taste, ambience, culture or weather.
no code implementations • 4 Oct 2022 • Matīss Rikters, Sanita Reinsone
In this paper, we describe the adaptation of a simple word guessing game that occupied the hearts and minds of people around the world.
no code implementations • 7 Sep 2021 • Matīss Rikters, Toshiaki Nakazawa
One of the most popular methods for context-aware machine translation (MT) is to use separate encoders for the source sentence and context as multiple sources for one target sentence.
1 code implementation • 9 Jun 2021 • Maija Kāle, Matīss Rikters
We analysed sentiment and frequencies related to smell, taste and temperature expressed by food tweets in the Latvian language.
1 code implementation • WMT (EMNLP) 2020 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but document-level (DL) MT has not, as it is difficult to 1) train with only a small amount of DL data; and 2) evaluate, since the main methods and data sets focus on SL evaluation.
1 code implementation • WS 2019 • Matīss Rikters, Ryokan Ri, Tong Li, Toshiaki Nakazawa
While the progress of machine translation of written text has come far in the past several years thanks to the increasing availability of parallel corpora and corpora-based training technologies, automatic translation of spoken text and dialogues remains challenging even for modern systems.
Ranked #1 on Machine Translation on Business Scene Dialogue JA-EN (using extra training data)
1 code implementation • 10 Jul 2020 • Uga Sproģis, Matīss Rikters
We present the Latvian Twitter Eater Corpus, a set of tweets in the narrow domain related to food, drinks, eating and drinking.
Ranked #1 on Sentiment Analysis on Latvian Twitter Eater Sentiment Dataset (using extra training data)
1 code implementation • 19 Oct 2018 • Matīss Rikters
Large parallel corpora that are automatically obtained from the web, documents or elsewhere often exhibit many corrupted parts that are bound to negatively affect the quality of the systems and models that learn from these corpora.
Ranked #1 on Machine Translation on WMT 2018 English-Finnish
1 code implementation • 8 Aug 2018 • Matīss Rikters
In this paper, we describe a tool for debugging the output and attention weights of neural machine translation (NMT) systems and for improved estimations of confidence about the output based on the attention.
Ranked #4 on Machine Translation on WMT 2017 Latvian-English
1 code implementation • MTSummit 2017 • Matīss Rikters, Ondřej Bojar
Processing of multi-word expressions (MWEs) is a known problem for any natural language processing task.
3 code implementations • MTSummit 2017 • Matīss Rikters, Mark Fishel
Attention distributions of the generated translations are a useful by-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens.
Ranked #3 on Machine Translation on WMT 2017 Latvian-English