no code implementations • EAMT 2022 • Artūrs Vasiļevskis, Jānis Ziediņš, Marko Tadić, Željka Motika, Mark Fishel, Hrafn Loftsson, Jón Guðnason, Claudia Borg, Keith Cortis, Judie Attard, Donatienne Spiteri
The work in progress on the CEF Action National Language Technology Platform (NLTP) is presented.
no code implementations • EAMT 2022 • Toms Bergmanis, Marcis Pinnis, Roberts Rozis, Jānis Šlapiņš, Valters Šics, Berta Bernāne, Guntars Pužulis, Endijs Titomers, Andre Tättar, Taido Purason, Hele-Andra Kuulmets, Agnes Luhtaru, Liisa Rätsep, Maali Tars, Annika Laumets-Tättar, Mark Fishel
We present the MTee project - a research initiative funded via an Estonian public procurement to develop machine translation technology that is open-source and free of charge.
no code implementations • ACL 2022 • Matīss Rikters, Marili Tomingas, Tuuli Tuisk, Valts Ernštreits, Mark Fishel
Livonian is one of the most endangered languages in Europe with just a tiny handful of speakers and virtually no publicly available corpora.
no code implementations • TDLE (LREC) 2022 • Marko Tadić, Daša Farkaš, Matea Filko, Artūrs Vasiļevskis, Andrejs Vasiļjevs, Jānis Ziediņš, Željka Motika, Mark Fishel, Hrafn Loftsson, Jón Guðnason, Claudia Borg, Keith Cortis, Judie Attard, Donatienne Spiteri
This article presents work in progress on the collaborative project of several European countries to develop a National Language Technology Platform (NLTP).
no code implementations • WMT (EMNLP) 2021 • Lisa Yankovskaya, Mark Fishel
The paper presents our submission to the WMT2021 Shared Task on Quality Estimation (QE).
no code implementations • WMT (EMNLP) 2020 • Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Vishrav Chaudhary, Mark Fishel, Francisco Guzmán, Lucia Specia
We explore (a) a black-box approach to QE based on pre-trained representations; and (b) glass-box approaches that leverage various indicators that can be extracted from the neural MT systems.
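One commonly used glass-box indicator of this kind is the average log-probability the MT system assigns to its own output tokens. The sketch below is illustrative only (the token probabilities are made up, not taken from the paper's systems):

```python
import math

def avg_log_prob(token_probs):
    """Glass-box QE indicator: mean log-probability of the MT output tokens.

    Values closer to 0 suggest the model was confident in its translation;
    very negative values flag potentially low-quality output.
    """
    return sum(math.log(p) for p in token_probs) / len(token_probs)

# Hypothetical per-token probabilities from an NMT decoder
confident = avg_log_prob([0.9, 0.8, 0.95, 0.85])
uncertain = avg_log_prob([0.3, 0.2, 0.5, 0.1])
print(confident > uncertain)  # the confident hypothesis scores higher
```

Because the indicator comes from the translation model itself, it requires no reference translation and no separately trained QE model.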
no code implementations • 8 Mar 2024 • Agnes Luhtaru, Taido Purason, Martin Vainikko, Maksym Del, Mark Fishel
This study explores enhancing grammatical error correction (GEC) through artificial error generation (AEG) using language models (LMs).
no code implementations • 18 Feb 2024 • Agnes Luhtaru, Martin Vainikko, Krista Liin, Kais Allkivi-Metsoja, Jaagup Kippar, Pille Eslon, Mark Fishel
To mitigate this, (1) we annotated more correction data for model training and testing, (2) we tested transfer learning, i.e., retraining machine learning models created for other tasks so as not to depend solely on correction data, and (3) we compared the developed method and model with alternatives, including large language models.
no code implementations • 20 Dec 2022 • Maksym Del, Mark Fishel
Our work introduces a challenging benchmark for future studies on reasoning in language models and contributes to a better understanding of the limits of LLMs' abilities.
1 code implementation • 4 Dec 2022 • Maksym Del, Mark Fishel
Related works used indexes like CKA and variants of CCA to measure the similarity of cross-lingual representations in multilingual language models.
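Linear CKA (centered kernel alignment), one of the indexes mentioned, can be computed in a few lines. A minimal sketch with random matrices standing in for layer representations (the data here is synthetic, not from the paper):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (n_samples x dim).

    Returns a similarity in [0, 1]; 1 means the representations match
    up to an orthogonal transform and isotropic scaling.
    """
    # Center each feature dimension
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = np.linalg.norm(Y.T @ X, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
print(round(linear_cka(X, 2.0 * X), 4))  # 1.0: CKA is invariant to scaling
```

The invariance to isotropic scaling and orthogonal rotation is what makes CKA attractive for comparing cross-lingual representations, since different languages need not occupy identically oriented subspaces.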
1 code implementation • WMT (EMNLP) 2021 • Maksym Del, Elizaveta Korotkova, Mark Fishel
Here we analyze the sentence representations learned by NMT Transformers and show that these explicitly include information on text domains, even after seeing only the input sentences without domain labels.
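A standard way to test whether representations encode such information is a simple probe. The sketch below uses a nearest-centroid probe on synthetic vectors standing in for encoder sentence representations (the data and setup are illustrative, not the paper's actual experiment):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for encoder sentence representations from two domains:
# each domain's representations cluster around a distinct mean vector.
dim = 32
centers = rng.normal(size=(2, dim))
reps = np.concatenate([
    centers[0] + 0.3 * rng.normal(size=(100, dim)),
    centers[1] + 0.3 * rng.normal(size=(100, dim)),
])
labels = np.array([0] * 100 + [1] * 100)

# Nearest-centroid probe: if domain information is linearly recoverable,
# assigning each vector to its closest domain centroid succeeds.
centroids = np.stack([reps[labels == d].mean(axis=0) for d in (0, 1)])
dists = np.linalg.norm(reps[:, None, :] - centroids[None, :, :], axis=2)
accuracy = (dists.argmin(axis=1) == labels).mean()
print(accuracy > 0.95)
```

High probe accuracy on real encoder states would indicate, as the paper argues, that domain identity is recoverable from the representations even though no domain labels were seen in training.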
1 code implementation • 2 Sep 2021 • Maksym Del, Mark Fishel
However, we observe that Baltic languages do belong to that shared space.
3 code implementations • 21 May 2020 • Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia
Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user about the quality of the MT output at test time.
no code implementations • 25 Sep 2019 • Maksym Del, Mark Fishel
Current state-of-the-art results in multilingual natural language inference (NLI) are based on tuning XLM (a pre-trained polyglot language model) separately for each language involved, resulting in multiple models.
no code implementations • WS 2019 • Erick Fonseca, Lisa Yankovskaya, André F. T. Martins, Mark Fishel, Christian Federmann
We report the results of the WMT19 shared task on Quality Estimation, i.e., the task of predicting the quality of the output of machine translation systems given just the source text and the hypothesis translations.
no code implementations • WS 2019 • Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, Marcos Zampieri
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2019.
no code implementations • WS 2019 • Elizaveta Yankovskaya, Andre Tättar, Mark Fishel
We propose the use of pre-trained embeddings as features of a regression model for sentence-level quality estimation of machine translation.
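The setup can be sketched with synthetic data standing in for sentence embeddings and human quality scores; everything here is illustrative (ordinary least squares on random features), not the authors' actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-ins for pre-trained sentence embeddings of (source, MT output) pairs
n_pairs, dim = 200, 16
features = rng.normal(size=(n_pairs, dim))
# Synthetic "human quality scores" linearly related to the features
true_w = rng.normal(size=dim)
scores = features @ true_w + 0.1 * rng.normal(size=n_pairs)

# Sentence-level QE as a regression from embedding features to scores
w, *_ = np.linalg.lstsq(features, scores, rcond=None)
predicted = features @ w

# Pearson correlation with the gold scores, the usual QE evaluation metric
corr = np.corrcoef(predicted, scores)[0, 1]
print(corr > 0.9)
```

The appeal of this approach is that the heavy lifting is done by the pre-trained embeddings, so the QE model itself can stay small and needs comparatively little labeled data.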
no code implementations • WS 2019 • Andre Tättar, Elizaveta Korotkova, Mark Fishel
This paper describes the University of Tartu's submission to the news translation shared task of WMT19, where the core idea was to train a single multilingual system to cover several language pairs of the shared task and submit its results.
no code implementations • 27 Mar 2019 • Elizaveta Korotkova, Agnes Luhtaru, Maksym Del, Krista Liin, Daiga Deksne, Mark Fishel
Both grammatical error correction and text style transfer can be viewed as monolingual sequence-to-sequence transformation tasks, but the scarcity of directly annotated data for either task makes them infeasible for most languages.
no code implementations • EMNLP 2018 • Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
no code implementations • WS 2018 • Maksym Del, Andre Tättar, Mark Fishel
This paper describes the University of Tartu's submission to the unsupervised machine translation track of the WMT18 news translation shared task.
no code implementations • WS 2018 • Elizaveta Yankovskaya, Andre Tättar, Mark Fishel
This paper describes the submissions of the team from the University of Tartu for the sentence-level Quality Estimation shared task of WMT18.
no code implementations • WS 2018 • Ondřej Bojar, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Philipp Koehn, Christof Monz
This paper presents the results of the premier shared task organized alongside the Conference on Machine Translation (WMT) 2018.
no code implementations • 1 Aug 2018 • Elizaveta Korotkova, Maksym Del, Mark Fishel
We introduce the task of zero-shot style transfer between different languages.
no code implementations • 30 Jul 2018 • Hasan Sait Arslan, Mark Fishel, Gholamreza Anbarjafari
In this paper, a doubly-attentive transformer machine translation model (DATNMT) is presented, in which a doubly-attentive transformer decoder incorporates spatial visual features obtained via pretrained convolutional neural networks, bridging the gap between image captioning and translation.
no code implementations • 6 May 2018 • Sander Tars, Mark Fishel
We present an approach to neural machine translation (NMT) that supports multiple domains in a single model and allows switching between the domains when translating.
3 code implementations • MTSummit 2017 • Matīss Rikters, Mark Fishel
Attention distributions of the generated translations are a useful bi-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens.
Ranked #3 on Machine Translation on WMT 2017 Latvian-English
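The soft-alignment view of attention can be illustrated with a toy attention matrix (the values are made up for illustration, not produced by the paper's models):

```python
import numpy as np

# Hypothetical attention matrix from a recurrent NMT model:
# rows = output tokens, columns = input tokens, each row sums to 1.
attention = np.array([
    [0.8, 0.1, 0.1],   # output token 0 attends mostly to input token 0
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
])

# Treating attention as soft alignments, a hard alignment falls out of argmax
hard_alignment = attention.argmax(axis=1)
print(hard_alignment.tolist())  # [0, 1, 2]

# How "peaked" each row is can serve as a simple per-token confidence cue
print(attention.max(axis=1).round(1).tolist())  # [0.8, 0.7, 0.7]
```

Diffuse rows (no clearly dominant input token) are one signal such confidence-inspection methods use to flag potentially problematic translations.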
no code implementations • LREC 2014 • Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard van Loenhout, Arantza del Pozo, Mirjam Sepesy Maučec, Anja Turner, Martin Volk
This article describes a large-scale evaluation of the use of Statistical Machine Translation for professional subtitling.
no code implementations • LREC 2012 • Mark Fishel, Ondřej Bojar, Maja Popović
Recently the first methods of automatic diagnostics of machine translation have emerged; since this area of research is relatively young, the efforts are not coordinated.
no code implementations • LREC 2012 • Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maučec, Andy Way, Panayota Georgakopoulou, Martin Volk
Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase the efficiency of the subtitle production process.
no code implementations • LREC 2012 • Jan Berka, Ondřej Bojar, Mark Fishel, Maja Popović, Daniel Zeman
We present a complex, open source tool for detailed machine translation error analysis, providing the user with automatic error detection and classification, several monolingual alignment algorithms, as well as training and test corpus browsing.