Search Results for author: Josef van Genabith

Found 111 papers, 11 papers with code

Translation Quality Estimation by Jointly Learning to Score and Rank

no code implementations EMNLP 2020 Jingyi Zhang, Josef van Genabith

In order to make use of different types of human evaluation data for supervised learning, we present a multi-task learning QE model that jointly learns two tasks: score a translation and rank two translations.

Multi-Task Learning Sentence Embeddings +1
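The snippet above names the two jointly learned tasks but not the loss; as a minimal sketch of how a score-and-rank multi-task objective can be combined (the MSE scoring term, the margin-ranking term, and the mixing weight `alpha` are assumptions for illustration, not the paper's actual formulation):

```python
import numpy as np

def score_loss(pred, gold):
    """Regression loss: mean squared error between predicted and human scores."""
    return float(np.mean((pred - gold) ** 2))

def rank_loss(score_a, score_b, label, margin=0.1):
    """Margin ranking loss: label = +1 if translation A was judged better, -1 otherwise."""
    return float(np.maximum(0.0, -label * (score_a - score_b) + margin).mean())

def joint_loss(pred, gold, score_a, score_b, label, alpha=0.5):
    """Weighted multi-task objective combining both supervision signals."""
    return alpha * score_loss(pred, gold) + (1 - alpha) * rank_loss(score_a, score_b, label)

# Toy example: direct-assessment scores plus one pairwise preference judgment
pred = np.array([0.7, 0.4])
gold = np.array([0.8, 0.3])
loss = joint_loss(pred, gold, np.array([0.7]), np.array([0.4]), np.array([1.0]))
```

In a real QE model both heads would share an encoder over the source and translation; here the scores are placeholder arrays.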

TransIns: Document Translation with Markup Reinsertion

1 code implementation EMNLP (ACL) 2021 Jörg Steffen, Josef van Genabith

This is challenging, as markup can be nested, can apply to spans that are contiguous in the source but non-contiguous in the target, etc.

Document Translation Translation

Tracing Source Language Interference in Translation with Graph-Isomorphism Measures

no code implementations RANLP 2021 Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith

Previous research has used linguistic features to show that translations exhibit traces of source language interference and that phylogenetic trees between languages can be reconstructed from the results of translations into the same language.

Translation

UdS-DFKI@WMT20: Unsupervised MT and Very Low Resource Supervised MT for German-Upper Sorbian

no code implementations WMT (EMNLP) 2020 Sourav Dutta, Jesujoba Alabi, Saptarashmi Bandyopadhyay, Dana Ruiter, Josef van Genabith

This paper describes the UdS-DFKI submission to the shared task for unsupervised machine translation (MT) and very low-resource supervised MT between German (de) and Upper Sorbian (hsb) at the Fifth Conference on Machine Translation (WMT20).

Translation Unsupervised Machine Translation

Exploiting Social Media Content for Self-Supervised Style Transfer

1 code implementation 18 May 2022 Dana Ruiter, Thomas Kleinbauer, Cristina España-Bonet, Josef van Genabith, Dietrich Klakow

Recent research on style transfer takes inspiration from unsupervised neural machine translation (UNMT), learning from large amounts of non-parallel data by exploiting cycle consistency loss, back-translation, and denoising autoencoders.

Denoising Machine Translation +2

Towards Debiasing Translation Artifacts

no code implementations16 May 2022 Koel Dutta Chowdhury, Rricha Jalota, Cristina España-Bonet, Josef van Genabith

Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets.

Natural Language Inference Translation

Mid-Air Hand Gestures for Post-Editing of Machine Translation

1 code implementation ACL 2021 Rashad Albo Jamara, Nico Herbig, Antonio Krüger, Josef van Genabith

Here, we present the first study that investigates the usefulness of mid-air hand gestures in combination with the keyboard (GK) for text editing in post-editing (PE) of machine translation (MT).

Machine Translation Translation

A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment

no code implementations ACL 2021 Jingyi Zhang, Josef van Genabith

We further fine-tune the target-to-source attention in the BTBA model to obtain better alignments using a full context based optimization method and self-supervised training.

Machine Translation Translation +1

Multi-Head Highly Parallelized LSTM Decoder for Neural Machine Translation

no code implementations ACL 2021 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Meng Zhang

This has to be computed n times for a sequence of length n. The linear transformations involved in the LSTM gate and state computations are the major cost factors in this.

Machine Translation Translation

Understanding Translationese in Multi-view Embedding Spaces

no code implementations COLING 2020 Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith

Recent studies use a combination of lexical and syntactic features to show that footprints of the source language remain visible in translations, to the extent that it is possible to predict the original source language from the translation.

Translation

Linguistically inspired morphological inflection with a sequence to sequence model

no code implementations4 Sep 2020 Eleni Metheniti, Guenter Neumann, Josef van Genabith

Inflection is an essential part of every human language's morphology, yet little effort has been made to unify linguistic theory and computational methods in recent years.

Language Acquisition Morphological Inflection

Transformer with Depth-Wise LSTM

no code implementations13 Jul 2020 Hongfei Xu, Qiuhui Liu, Deyi Xiong, Josef van Genabith

In this paper, we argue that the residual connection has drawbacks, and propose to train Transformers with a depth-wise LSTM that treats the outputs of layers as steps in a time series instead of using residual connections. The motivation is that the vanishing gradient problem suffered by deep networks is the same as that of recurrent networks applied to long sequences, while the LSTM (Hochreiter and Schmidhuber, 1997) has proven capable of capturing long-distance relationships, and its design may alleviate some drawbacks of residual connections while still ensuring convergence.

Time Series

MMPE: A Multi-Modal Interface for Post-Editing Machine Translation

no code implementations ACL 2020 Nico Herbig, Tim Düwel, Santanu Pal, Kalliopi Meladaki, Mahsa Monshizadeh, Antonio Krüger, Josef van Genabith

On the other hand, speech and multi-modal combinations of select & speech are considered suitable for replacements and insertions but offer less potential for deletion and reordering.

Machine Translation Translation

How Human is Machine Translationese? Comparing Human and Machine Translations of Text and Speech

no code implementations WS 2020 Yuri Bizzoni, Tom S Juzek, Cristina España-Bonet, Koel Dutta Chowdhury, Josef van Genabith, Elke Teich

Some translationese features tend to appear in simultaneous interpreting with higher frequency than in human text translation, but the reasons for this are unclear.

Machine Translation Translation

MMPE: A Multi-Modal Interface using Handwriting, Touch Reordering, and Speech Commands for Post-Editing Machine Translation

no code implementations ACL 2020 Nico Herbig, Santanu Pal, Tim Düwel, Kalliopi Meladaki, Mahsa Monshizadeh, Vladislav Hnatovskiy, Antonio Krüger, Josef van Genabith

The shift from traditional translation to post-editing (PE) of machine-translated (MT) text can save time and reduce errors, but it also affects the design of translation interfaces, as the task changes from mainly generating text to correcting errors within otherwise helpful translation proposals.

Machine Translation Translation

Learning Source Phrase Representations for Neural Machine Translation

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Considering that modeling phrases instead of words has significantly improved the Statistical Machine Translation (SMT) approach through the use of larger translation blocks ("phrases") and its reordering ability, modeling NMT at phrase level is an intuitive proposal to help the model capture long-distance relationships.

Machine Translation Translation

Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change

no code implementations ACL 2020 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu

We propose to automatically and dynamically determine batch sizes by accumulating gradients of mini-batches and performing an optimization step at just the time when the direction of gradients starts to fluctuate.
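The snippet states the criterion only informally; one plausible sketch (the cosine-similarity test and the zero threshold are assumptions) accumulates mini-batch gradients until the newest gradient points away from the running sum, which is when an optimization step would be taken:

```python
import numpy as np

def accumulate_until_fluctuation(grads, threshold=0.0):
    """Accumulate mini-batch gradients; trigger an optimizer step once the cosine
    similarity between the running sum and the newest gradient drops below the
    threshold, i.e. the gradient direction starts to fluctuate."""
    acc = np.zeros_like(grads[0])
    for i, g in enumerate(grads):
        if i > 0:
            cos = np.dot(acc, g) / (np.linalg.norm(acc) * np.linalg.norm(g) + 1e-12)
            if cos < threshold:
                return acc, i  # step now, using the gradient accumulated so far
        acc = acc + g
    return acc, len(grads)

# Toy example: two aligned mini-batch gradients, then one pointing the other way
grads = [np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([-1.0, 0.0])]
acc, step_at = accumulate_until_fluctuation(grads)
```

In practice the accumulated gradient would be handed to the optimizer and the accumulator reset, so the effective batch size grows and shrinks with the training dynamics.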

Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation

no code implementations EMNLP 2020 Dana Ruiter, Josef van Genabith, Cristina España-Bonet

Self-supervised neural machine translation (SSNMT) jointly learns to identify and select suitable training data from comparable (rather than parallel) corpora and to translate, in a way that the two tasks support each other in a virtuous circle.

Denoising Machine Translation +1

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers

no code implementations NAACL 2021 Hongfei Xu, Josef van Genabith, Qiuhui Liu, Deyi Xiong

Due to its effectiveness and performance, the Transformer translation model has attracted wide attention, most recently in terms of probing-based approaches.

Translation Word Translation

Lipschitz Constrained Parameter Initialization for Deep Transformers

no code implementations ACL 2020 Hongfei Xu, Qiuhui Liu, Josef van Genabith, Deyi Xiong, Jingyi Zhang

In this paper, we first empirically demonstrate that a simple modification made in the official implementation, which changes the computation order of residual connection and layer normalization, can significantly ease the optimization of deep Transformers.

Translation

Analysing Coreference in Transformer Outputs

no code implementations WS 2019 Ekaterina Lapshinova-Koltunski, Cristina España-Bonet, Josef van Genabith

We analyse coreference phenomena in three neural machine translation systems trained with different data settings with or without access to explicit intra- and cross-sentential anaphoric information.

Machine Translation Translation

Self-Induced Curriculum Learning in Neural Machine Translation

no code implementations25 Sep 2019 Dana Ruiter, Cristina España-Bonet, Josef van Genabith

Self-supervised neural machine translation (SS-NMT) learns how to extract/select suitable training data from comparable (rather than parallel) corpora and how to translate, in a way that the two tasks support each other in a virtuous circle.

Denoising Machine Translation +1

UDS–DFKI Submission to the WMT2019 Similar Language Translation Shared Task

no code implementations16 Aug 2019 Santanu Pal, Marcos Zampieri, Josef van Genabith

The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.

Translation

Improving CAT Tools in the Translation Workflow: New Approaches and Evaluation

no code implementations WS 2019 Mihaela Vela, Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Josef van Genabith

User feedback revealed that the users preferred using CATaLog Online over existing CAT tools in some respects, especially by selecting the output of the MT system and taking advantage of the color scheme for TM suggestions.

Automatic Post-Editing Translation

The Transference Architecture for Automatic Post-Editing

no code implementations COLING 2020 Santanu Pal, Hongfei Xu, Nico Herbig, Sudip Kumar Naskar, Antonio Krueger, Josef van Genabith

In automatic post-editing (APE) it makes sense to condition post-editing (pe) decisions on both the source (src) and the machine translated text (mt) as input.

Automatic Post-Editing

USAAR-DFKI – The Transference Architecture for English–German Automatic Post-Editing

no code implementations WS 2019 Santanu Pal, Hongfei Xu, Nico Herbig, Antonio Krüger, Josef van Genabith

In this paper we present an English–German Automatic Post-Editing (APE) system called transference, submitted to the APE Task organized at WMT 2019.

Automatic Post-Editing Translation

UDS–DFKI Submission to the WMT2019 Czech–Polish Similar Language Translation Shared Task

no code implementations WS 2019 Santanu Pal, Marcos Zampieri, Josef van Genabith

The first edition of this shared task featured data from three pairs of similar languages: Czech and Polish, Hindi and Nepali, and Portuguese and Spanish.

Translation

JU-Saarland Submission to the WMT2019 English–Gujarati Translation Shared Task

no code implementations WS 2019 Riktim Mondal, Shankha Raj Nayek, Aditya Chowdhury, Santanu Pal, Sudip Kumar Naskar, Josef van Genabith

In this paper we describe our joint submission (JU-Saarland) from Jadavpur University and Saarland University in the WMT 2019 news translation shared task for the English–Gujarati language pair within the translation task sub-track.

Machine Translation Translation

Self-Supervised Neural Machine Translation

1 code implementation ACL 2019 Dana Ruiter, Cristina España-Bonet, Josef van Genabith

We present a simple new method where an emergent NMT system is used for simultaneously selecting training data and learning internal NMT representations.

Machine Translation Translation

Integrating Artificial and Human Intelligence for Efficient Translation

no code implementations7 Mar 2019 Nico Herbig, Santanu Pal, Josef van Genabith, Antonio Krüger

Current advances in machine translation increase the need for translators to switch from traditional translation to post-editing of machine-translated text, a process that saves time and improves quality.

Machine Translation Translation

INFODENS: An Open-source Framework for Learning Text Representations

1 code implementation16 Oct 2018 Ahmad Taie, Raphael Rubino, Josef van Genabith

The advent of representation learning methods enabled large performance gains on various language tasks, alleviating the need for manual feature engineering.

Feature Engineering General Classification +2

A Transformer-Based Multi-Source Automatic Post-Editing System

no code implementations WS 2018 Santanu Pal, Nico Herbig, Antonio Krüger, Josef van Genabith

The proposed model is an extension of the transformer architecture: two separate self-attention-based encoders encode the machine translation output (mt) and the source (src), followed by a joint encoder that attends over a combination of these two encoded sequences (encsrc and encmt) for generating the post-edited sentence.

Automatic Post-Editing Translation

Code-Mixed Question Answering Challenge: Crowd-sourcing Data and Techniques

no code implementations WS 2018 Khyathi Chandu, Ekaterina Loginova, Vishal Gupta, Josef van Genabith, Günter Neumann, Manoj Chinnakotla, Eric Nyberg, Alan W. Black

As a first step towards fostering research which supports CM in NLP applications, we systematically crowd-sourced and curated an evaluation dataset for factoid question answering in three CM languages - Hinglish (Hindi+English), Tenglish (Telugu+English) and Tamlish (Tamil+English) which belong to two language families (Indo-Aryan and Dravidian).

Question Answering

The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction

no code implementations WS 2017 Fraser Bowen, Jon Dehdari, Josef van Genabith

In this research we investigate the impact of mismatches in the density and type of error between training and test data on a neural system correcting preposition and determiner errors.

Grammatical Error Correction Machine Translation

Predicting the Law Area and Decisions of French Supreme Court Cases

no code implementations RANLP 2017 Octavia-Maria Sulea, Marcos Zampieri, Mihaela Vela, Josef van Genabith

In this paper, we investigate the application of text classification methods to predict the law area and the decision of cases judged by the French Supreme Court.

General Classification Text Classification

Massively Multilingual Neural Grapheme-to-Phoneme Conversion

1 code implementation WS 2017 Ben Peters, Jon Dehdari, Josef van Genabith

Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems.

Automatic Speech Recognition

Neural Automatic Post-Editing Using Prior Alignment and Reranking

no code implementations EACL 2017 Santanu Pal, Sudip Kumar Naskar, Mihaela Vela, Qun Liu, Josef van Genabith

APE translations produced by our system show statistically significant improvements over the first-stage MT, phrase-based APE and the best reported score on the WMT 2016 APE dataset by a previous neural APE system.

Automatic Post-Editing Re-Ranking +1

Modeling Diachronic Change in Scientific Writing with Information Density

no code implementations COLING 2016 Raphael Rubino, Stefania Degaetano-Ortlieb, Elke Teich, Josef van Genabith

In this paper we investigate the introduction of information theory inspired features to study long term diachronic change on three levels: lexis, part-of-speech and syntax.

General Classification Informativeness

Multi-Engine and Multi-Alignment Based Automatic Post-Editing and its Impact on Translation Productivity

no code implementations COLING 2016 Santanu Pal, Sudip Kumar Naskar, Josef van Genabith

In the paper we show that parallel system combination in the APE stage of a sequential MT-APE combination yields substantial translation improvements, both in terms of automatic evaluation metrics and in terms of productivity gains measured in a post-editing experiment.

Automatic Post-Editing Translation

Neural Morphological Tagging from Characters for Morphologically Rich Languages

no code implementations21 Jun 2016 Georg Heigold, Guenter Neumann, Josef van Genabith

We systematically explore a variety of neural architectures (DNN, CNN, CNNHighway, LSTM, BLSTM) to obtain character-based word vectors combined with bidirectional LSTMs to model across-word context in an end-to-end setting.

Morphological Tagging TAG +1

CATaLog Online: Porting a Post-editing Tool to the Web

no code implementations LREC 2016 Santanu Pal, Marcos Zampieri, Sudip Kumar Naskar, Tapas Nayak, Mihaela Vela, Josef van Genabith

The tool features a number of editing and log functions similar to the desktop version of CATaLog enhanced with several new features that we describe in detail in this paper.

Machine Translation Translation

Irish Treebanking and Parsing: A Preliminary Evaluation

no code implementations LREC 2012 Teresa Lynn, Özlem Çetinoğlu, Jennifer Foster, Elaine Uí Dhonnchadha, Mark Dras, Josef van Genabith

This paper describes the early stages in the development of new language resources for Irish ― namely the first Irish dependency treebank and the first Irish statistical dependency parser.

Machine Translation POS

The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation

no code implementations LREC 2012 Christian Federmann, Eleftherios Avramidis, Marta R. Costa-jussà, Josef van Genabith, Maite Melero, Pavel Pecina

We describe the “Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation” (ML4HMT) which aims to foster research on improved system combination approaches for machine translation (MT).

Language Modelling Machine Translation +1

Automatic Extraction and Evaluation of Arabic LFG Resources

no code implementations LREC 2012 Mohammed Attia, Khaled Shaalan, Lamia Tounsi, Josef van Genabith

We utilize this annotation to automatically acquire grammatical function (dependency) based subcategorization frames and paths linking long-distance dependencies (LDDs).

POS

Arabic Word Generation and Modelling for Spell Checking

no code implementations LREC 2012 Khaled Shaalan, Mohammed Attia, Pavel Pecina, Younes Samih, Josef van Genabith

Furthermore, from a large list of valid forms and invalid forms we create a character-based tri-gram language model to approximate knowledge about permissible character clusters in Arabic, creating a novel method for detecting spelling errors.

Language Modelling Morphological Analysis +1
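As a rough illustration of the detection idea (a binary seen/unseen approximation of the described character tri-gram language model, on toy Latin-script forms rather than the paper's Arabic word lists):

```python
from collections import Counter

def train_trigrams(words):
    """Collect character trigrams (with boundary markers) from a list of valid word forms."""
    seen = Counter()
    for w in words:
        padded = f"##{w}#"
        for i in range(len(padded) - 2):
            seen[padded[i:i + 3]] += 1
    return seen

def flag_misspelling(word, trigrams):
    """Flag a word if it contains any character trigram never observed in valid forms."""
    padded = f"##{word}#"
    return any(padded[i:i + 3] not in trigrams for i in range(len(padded) - 2))

# Toy example: three valid forms, one known word, one with unseen character clusters
model = train_trigrams(["kitab", "kutub", "katib"])
ok = flag_misspelling("kitab", model)
bad = flag_misspelling("ktbxa", model)
```

A real system would smooth trigram probabilities and threshold on the whole word's likelihood rather than flagging any single unseen trigram.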

A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation

no code implementations LREC 2012 Eleftherios Avramidis, Marta R. Costa-jussà, Christian Federmann, Josef van Genabith, Maite Melero, Pavel Pecina

This corpus aims to serve as a basic resource for further research on whether hybrid machine translation algorithms and system combination techniques can benefit from additional (linguistically motivated, decoding, and runtime) information provided by the different systems involved.

Machine Translation Translation
