Search Results for author: Mark Fishel

Found 38 papers, 6 papers with code

Machine Translation for Livonian: Catering to 20 Speakers

no code implementations ACL 2022 Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, Mark Fishel

Livonian is one of the most endangered languages in Europe with just a tiny handful of speakers and virtually no publicly available corpora.

Cross-Lingual Transfer Machine Translation +2

BERGAMOT-LATTE Submissions for the WMT20 Quality Estimation Shared Task

no code implementations WMT (EMNLP) 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Vishrav Chaudhary, Mark Fishel, Francisco Guzmán, Lucia Specia

We explore (a) a black-box approach to QE based on pre-trained representations; and (b) glass-box approaches that leverage various indicators that can be extracted from the neural MT systems.

Sentence Task 2

To Err Is Human, but Llamas Can Learn It Too

no code implementations 8 Mar 2024 Agnes Luhtaru, Taido Purason, Martin Vainikko, Maksym Del, Mark Fishel

This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs).

Grammatical Error Correction

Autocorrect for Estonian texts: final report from project EKTB25

no code implementations 18 Feb 2024 Agnes Luhtaru, Martin Vainikko, Krista Liin, Kais Allkivi-Metsoja, Jaagup Kippar, Pille Eslon, Mark Fishel

To mitigate this, (1) we annotated more correction data for model training and testing, (2) we tested transfer learning, i.e., retraining machine learning models created for other tasks, so as not to depend solely on correction data, and (3) we compared the developed method and model with alternatives, including large language models.

Language Modelling Transfer Learning

True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4

no code implementations 20 Dec 2022 Maksym Del, Mark Fishel

Our work introduces a challenging benchmark for future studies on reasoning in language models and contributes to a better understanding of the limits of LLMs' abilities.

Multiple-choice

Cross-lingual Similarity of Multilingual Representations Revisited

1 code implementation 4 Dec 2022 Maksym Del, Mark Fishel

Related works used indexes like CKA and variants of CCA to measure the similarity of cross-lingual representations in multilingual language models.

Causal Language Modeling Language Modelling +1

Translation Transformers Rediscover Inherent Data Domains

1 code implementation WMT (EMNLP) 2021 Maksym Del, Elizaveta Korotkova, Mark Fishel

Here we analyze the sentence representations learned by NMT Transformers and show that these explicitly include information on text domains, even after only seeing the input sentences without domain labels.

Clustering Domain Adaptation +5

Unsupervised Quality Estimation for Neural Machine Translation

3 code implementations 21 May 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time.

Machine Translation Translation +1

XD: Cross-lingual Knowledge Distillation for Polyglot Sentence Embeddings

no code implementations 25 Sep 2019 Maksym Del, Mark Fishel

Current state-of-the-art results in multilingual natural language inference (NLI) are based on tuning XLM (a pre-trained polyglot language model) separately for each language involved, resulting in multiple models.

Knowledge Distillation Language Modelling +3

Findings of the WMT 2019 Shared Tasks on Quality Estimation

no code implementations WS 2019 Erick Fonseca, Lisa Yankovskaya, André F. T. Martins, Mark Fishel, Christian Federmann

We report the results of the WMT19 shared task on Quality Estimation, i.e., the task of predicting the quality of the output of machine translation systems given just the source text and the hypothesis translations.

Machine Translation Sentence +1

Quality Estimation and Translation Metrics via Pre-trained Word and Sentence Embeddings

no code implementations WS 2019 Elizaveta Yankovskaya, Andre Tättar, Mark Fishel

We propose the use of pre-trained embeddings as features of a regression model for sentence-level quality estimation of machine translation.

Machine Translation regression +3

University of Tartu's Multilingual Multi-domain WMT19 News Translation Shared Task Submission

no code implementations WS 2019 Andre Tättar, Elizaveta Korotkova, Mark Fishel

This paper describes the University of Tartu's submission to the news translation shared task of WMT19, where the core idea was to train a single multilingual system to cover several language pairs of the shared task and submit its results.

Translation

Grammatical Error Correction and Style Transfer via Zero-shot Monolingual Translation

no code implementations 27 Mar 2019 Elizaveta Korotkova, Agnes Luhtaru, Maksym Del, Krista Liin, Daiga Deksne, Mark Fishel

Both grammatical error correction and text style transfer can be viewed as monolingual sequence-to-sequence transformation tasks, but the scarcity of directly annotated data for either task makes them unfeasible for most languages.

Grammatical Error Correction Style Transfer +2

Phrase-based Unsupervised Machine Translation with Compositional Phrase Embeddings

no code implementations WS 2018 Maksym Del, Andre Tättar, Mark Fishel

This paper describes the University of Tartu's submission to the unsupervised machine translation track of the WMT18 news translation shared task.

Translation Unsupervised Machine Translation

Quality Estimation with Force-Decoded Attention and Cross-lingual Embeddings

no code implementations WS 2018 Elizaveta Yankovskaya, Andre Tättar, Mark Fishel

This paper describes the submissions of the team from the University of Tartu for the sentence-level Quality Estimation shared task of WMT18.

Machine Translation regression +2

Doubly Attentive Transformer Machine Translation

no code implementations 30 Jul 2018 Hasan Sait Arslan, Mark Fishel, Gholamreza Anbarjafari

In this paper, a doubly attentive transformer machine translation model (DATNMT) is presented, in which a doubly-attentive transformer decoder incorporates spatial visual features obtained via pretrained convolutional neural networks, bridging the gap between image captioning and translation.

Image Captioning Multimodal Machine Translation +1

Multi-Domain Neural Machine Translation

no code implementations 6 May 2018 Sander Tars, Mark Fishel

We present an approach to neural machine translation (NMT) that supports multiple domains in a single model and allows switching between the domains when translating.

Machine Translation NMT +1

Confidence through Attention

3 code implementations MTSummit 2017 Matīss Rikters, Mark Fishel

Attention distributions of the generated translations are a useful by-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens.

Machine Translation Translation

Terra: a Collection of Translation Error-Annotated Corpora

no code implementations LREC 2012 Mark Fishel, Ondřej Bojar, Maja Popović

Recently the first methods of automatic diagnostics of machine translation have emerged; since this area of research is relatively young, the efforts are not coordinated.

Machine Translation Translation

SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles

no code implementations LREC 2012 Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maučec, Andy Way, Panayota Georgakopoulou, Martin Volk

Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase the efficiency of the subtitle production process.

Machine Translation Translation

Automatic MT Error Analysis: Hjerson Helping Addicter

no code implementations LREC 2012 Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović, Daniel Zeman

We present a complex, open source tool for detailed machine translation error analysis providing the user with automatic error detection and classification, several monolingual alignment algorithms as well as with training and test corpus browsing.

General Classification Machine Translation +1
